Canadian Intellectual Property Office

(12)(19)(CA) Application

(21)(A1) 2,252,064

(22) 1998/11/20 
(43) 2000/05/20 



(72) BURRELL, Paul Christopher, AU 
(72) BLACKALL, Linda Louise, AU
(72) KELLER, Jurg, AU 

(71) CRC FOR WASTE MANAGEMENT AND POLLUTION CONTROL 
LIMITED, AU 

(51) Int.Cl. 6 C12N 1/20, C12Q 1/68, C02F 3/34, C07H 21/04

(54) MICROORGANISMES OXYDANT LES NITRITES DANS L' EAU 

(54) AQUATIC NITRITE OXIDISING MICROORGANISMS 



(57) The invention relates to the nitrification of wastewater and identification of microorganisms capable of
participating in this process. Specifically, the invention provides a consortium of microorganisms capable of nitrite
oxidation in wastewater, which consortium is enriched in members of the Nitrospira phylum. The invention also
provides oligonucleotide primers and probes for the amplification or detection of Nitrospira DNA, kits comprising
the primers and probes, and methods of detecting and quantitating Nitrospira species in a sample.






ABSTRACT 



The invention relates to the nitrification of wastewater and identification of microorganisms
capable of participating in this process. Specifically, the invention provides a consortium of
microorganisms capable of nitrite oxidation in wastewater, which consortium is enriched in members
of the Nitrospira phylum. The invention also provides oligonucleotide primers and probes for the
amplification or detection of Nitrospira DNA, kits comprising the primers and probes, and methods of
detecting and quantitating Nitrospira species in a sample.

AQUATIC NITRITE OXIDISING MICROORGANISMS

TECHNICAL FIELD

This invention relates to the removal of nitrogenous compounds from wastewater. In particular,
the invention relates to an isolated consortium of microorganisms capable of nitrification of wastewater.
The invention also relates to methods of identifying microorganisms capable of nitrification of
wastewater and oligonucleotide primers and DNA probes suitable for use in the methods.

INTRODUCTION 

The removal of nitrogenous compounds from sewage effluents is an important aspect of the
remediation of wastewaters. The presence of ammonia, nitrite and nitrate in wastewater discharges
can cause numerous problems ranging from eutrophication (Meganck and Faup, 1988) of the
receiving aquatic environment to aspects of public health concern such as nitrate contamination of
drinking water. Nitrogen is biologically removed from wastewaters in a two step process of
nitrification (ammonia oxidised to nitrate) (Randall, 1992; Robertson and Kuenen, 1991) and
denitrification (nitrate reduced to dinitrogen gas that dissipates into the atmosphere) (Blackburn, 1983;
Robertson and Kuenen, 1991). Nitrification is the first and most sensitive step of the process and can
be further subdivided into two steps: ammonia oxidation to nitrite and nitrite oxidation to nitrate. The
two steps are carried out by separate bacterial groups and for both groups, the total diversity of
organisms with this phenotype is small.

Therefore, nitrification is a process where reduced nitrogen compounds, generally ammonium
(NH₄⁺), are microbiologically oxidised to nitrate (NO₃⁻) via nitrite (NO₂⁻) under aerobic conditions
(Halling-Sørensen and Jørgensen, 1993). The overall reactions and possible organisms responsible
are:

$$2\,\mathrm{NH_4^+} + 3\,\mathrm{O_2} \xrightarrow{\textit{Nitrosomonas}} 2\,\mathrm{NO_2^-} + 2\,\mathrm{H_2O} + 4\,\mathrm{H^+} + \text{biomass}$$

$$2\,\mathrm{NO_2^-} + \mathrm{O_2} \xrightarrow{\textit{Nitrobacter}} 2\,\mathrm{NO_3^-} + \text{biomass}$$

The Gram negative chemoautotrophic nitrite oxidising bacteria are physiologically distinct, as
they all possess the ability to use nitrite as their energy source and to assimilate CO₂, via the Calvin
Benson cycle, as a carbon source for cell growth (Bock et al., 1992). For each molecule of CO₂
fixed, 100 molecules of nitrite need to be oxidized, emphasising the high energy demands placed on
these cells. The overall stoichiometry of nitrite oxidation is (Halling-Sørensen and Jørgensen, 1993):

$$400\,\mathrm{NO_2^-} + \mathrm{NH_4^+} + 4\,\mathrm{H_2CO_3} + \mathrm{HCO_3^-} + 195\,\mathrm{O_2} \rightarrow \mathrm{C_5H_7NO_2} + 3\,\mathrm{H_2O} + 400\,\mathrm{NO_3^-}$$
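The stoichiometry above can be checked for element and charge balance. The following is a minimal sketch, assuming the biomass term is C5H7NO2 (the hydrogen subscript is poorly legible in the source; 7 is the value that balances the equation). The helper names `parse_formula` and `side_totals` are illustrative only.

```python
import re
from collections import Counter

def parse_formula(formula):
    """Return (element counts, charge) for a simple formula such as 'NO2-' or 'NH4+'."""
    charge = 0
    if formula.endswith("+"):
        charge, formula = 1, formula[:-1]
    elif formula.endswith("-"):
        charge, formula = -1, formula[:-1]
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts, charge

def side_totals(terms):
    """Sum element counts and net charge over (coefficient, formula) pairs."""
    totals, net_charge = Counter(), 0
    for coefficient, formula in terms:
        counts, charge = parse_formula(formula)
        for element, n in counts.items():
            totals[element] += coefficient * n
        net_charge += coefficient * charge
    return dict(totals), net_charge

REACTANTS = [(400, "NO2-"), (1, "NH4+"), (4, "H2CO3"), (1, "HCO3-"), (195, "O2")]
PRODUCTS = [(1, "C5H7NO2"), (3, "H2O"), (400, "NO3-")]

if __name__ == "__main__":
    print("reactants:", side_totals(REACTANTS))
    print("products: ", side_totals(PRODUCTS))
    print("balanced: ", side_totals(REACTANTS) == side_totals(PRODUCTS))
```

With these coefficients both sides carry the same element totals and the same net charge, which supports reading the garbled biomass formula as C5H7NO2.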

These bacteria can typically also use nitric oxide (NO) instead of NO₂⁻ as an electron source
(Bock et al., 1992). Not all of the known nitrifying bacteria are obligate chemoautotrophs. In fact,
many strains of Nitrobacter can grow well as heterotrophs, where both energy and carbon are
obtained from organic carbon sources, or mixotrophically (a combination of both autotrophic and
heterotrophic behaviour). These bacteria are collectively known as facultative chemoautotrophs.
Therefore, bacterial strains can grow three ways: aerobically and autotrophically, aerobically and
mixotrophically, or anaerobically and heterotrophically. In mixotrophic growth, NO₂⁻ is oxidized in
preference to organic carbon substrates like acetate, pyruvate and glycerol. Both autotrophic and
heterotrophic growth are usually slow and inefficient.

As a generalisation, most strains of Nitrobacter seem to be able to grow faster as mixotrophs 
than as heterotrophs and faster heterotrophically or chemo-heterotrophically than 
chemoautotrophically. 

Four genera are currently recognised: Nitrobacter, Nitrospina, Nitrococcus and Nitrospira
(Halling-Sørensen and Jørgensen, 1993). Nitrospina and Nitrococcus are unable to grow
heterotrophically or mixotrophically (Bock et al., 1992). One species of Nitrospira, Nitrospira
marina, can grow autotrophically and mixotrophically (Bock et al., 1992), whereas Nitrospira
moscoviensis is an obligate autotroph (Ehrich et al., 1995). These nitrite oxidizers have also been
conventionally classified based on phenotypic characters like their cell shape and the ultrastructure of
their intracytoplasmic membranes. Doubling times of Nitrobacter can range from 12 to 59 hours, or
even as long as 140 hours (Halling-Sørensen and Jørgensen, 1993). These are therefore very slow
growing bacteria.

In wastewater treatment systems, Nitrosomonas (an ammonia oxidizer) and Nitrobacter (a 
nitrite oxidizer) are the two autotrophs presumed to be responsible for nitrification because they are 
the commonest ammonia and nitrite oxidizers isolated from these environments (Halling-Sørensen and
Jørgensen, 1993). Although ammonia oxidizers have been intensively studied by the use of molecular
methods (Wagner et al., 1995; Wagner et al., 1996), the nitrite oxidizers have not been similarly 
investigated. Since the microorganisms responsible for nitrite oxidation in wastewater treatment plants 
were presumed to be from the genus Nitrobacter, mathematical modeling of the process has used data 
relevant to this genus. However, fluorescent in situ hybridization (FISH) probing of activated sludge 
mixed liquors with Nitrobacter specific probes (Wagner et al., 1996) could not confirm the presence 
of these organisms suggesting that they were not responsible for this major component of nitrogen 
remediation. Indeed, Nitrobacter could not be found in other aquatic environments (Hovanec and 
DeLong, 1996) when specific FISH probes were employed. It was speculated that other bacteria were 
likely responsible for nitrite oxidation (Hovanec and DeLong, 1996; Wagner et al., 1996). 

Knowledge of the microorganisms responsible for nitrification of wastewater is desirable for 
the efficient management of treatment systems. It would also be advantageous to have available 
biomass which can be added to a system to implement or improve nitrification. However, as 
indicated above, there is no certainty in the art as to the actual microorganisms responsible for
nitrification, nor are there methods available for identifying such organisms.

SUMMARY OF THE INVENTION

It is an object of the invention to provide a consortium of microorganisms that can be used for 
nitrification of wastewater. 

A further object of the invention is to provide a method of identifying microorganisms capable 
of nitrification of wastewater. 

According to a first embodiment of the invention, there is provided a consortium of 
microorganisms capable of nitrite oxidation in wastewater, which consortium is enriched in members 
of the Nitrospira phylum. 

According to a second embodiment of the invention, there is provided an oligonucleotide 
primer for PCR amplification of Nitrospira DNA, said primer comprising at least 12 nucleotides 
having a sequence selected from: 

(i) any one of SEQ ID NO: 1 to SEQ ID NO: 13; or 

(ii) a DNA sequence having at least 92% identity with any one of SEQ ID NO: 1 to SEQ 
ID NO: 13. 

According to a third embodiment of the invention, there is provided a primer pair for PCR 
amplification of Nitrospira DNA, said primer pair comprising: 

(a) a first oligonucleotide of at least 12 nucleotides having a sequence selected from one 
strand of a bacterial 16S rDNA gene; and 

(b) a second oligonucleotide of at least 12 nucleotides having a sequence selected from the 
other strand of said 16S rDNA gene downstream of said first oligonucleotide sequence; wherein at 
least one of said first and second oligonucleotides is selected from: 

(i) any one of SEQ ID NO: 1 to SEQ ID NO: 13; or 

(ii) a DNA sequence having at least 92% identity with any one of SEQ ID NO: 1 to SEQ ID
NO: 13.

According to a fourth embodiment of the invention, there is provided a probe for detecting 
Nitrospira DNA, said probe comprising at least 12 nucleotides having a sequence selected from: 

(i) any one of SEQ ID NO: 1 to SEQ ID NO: 13; or 

(ii) a DNA sequence having at least 92% identity with any one of SEQ ID NO: 1 to SEQ 
ID NO: 13. 

According to a fifth embodiment of the invention, there is provided a kit comprising: 

at least one primer according to the second embodiment; 

at least one primer pair according to the third embodiment; or 

at least one probe according to the fourth embodiment. 

According to a sixth embodiment of the invention, there is provided a method of detecting a
Nitrospira species in a sample, said method comprising the steps of:

(a) lysing cells in said sample to release genomic DNA;

(b) contacting denatured genomic DNA from step (a) with a primer pair according to the
third embodiment;

(c) amplifying Nitrospira DNA by cyclically reacting said primer pair with said DNA to
produce an amplification product; and

(d) detecting said amplification product.

According to a seventh embodiment of the invention, there is provided a method of
quantitating the level of a Nitrospira species in a sample, said method comprising the steps of:

(a) lysing cells in said sample to release genomic DNA;

(b) contacting denatured genomic DNA from step (a) with a primer pair according to the
third embodiment;

(c) amplifying Nitrospira DNA by cyclically reacting said primer pair with said DNA to
produce an amplification product; and

(d) detecting said amplification product and quantitating the level of said product by
comparison with at least one reference standard.

According to an eighth embodiment of the invention, there is provided a method of detecting a 
Nitrospira species in a sample, said method comprising the steps of: 

(a) lysing cells in said sample to release genomic DNA; 

(b) contacting denatured genomic DNA from step (a) with a labeled probe according to
the fourth embodiment under conditions which allow hybridisation of said genomic DNA with said probe;

(c) separating hybridised labeled probe and genomic DNA from unhybridised labeled 
probe; and 

(d) detecting said labeled probe-genomic DNA hybrid. 

According to a ninth embodiment of the invention, there is provided a method of detecting 
cells of a Nitrospira species in a sample, said method comprising the steps of: 

(a) treating cells in said sample to fix cellular contents; 

(b) contacting said fixed cells from step (a) with a labeled probe according to the fourth
embodiment under conditions which allow said probe to hybridise with RNA within said fixed cell;

(c) removing unhybridised probe from said fixed cells; and

(d) detecting said labeled probe-RNA hybrid.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a graph showing influent and effluent NO₂-N concentrations for an automated
laboratory-scale reactor operating as a sequencing batch reactor at 2 cycles/day with strong selection
for nitrite oxidising biomass (NOSBR).

Figure 2 is a graph showing influent and effluent NO₂-N concentrations of the NOSBR
operating at 4 cycles/day.

Figure 3 is a graph of mixed liquor nitrite-N concentrations during the react period of the 
NOSBR cycle for attached growth and for suspended growth. 

Figure 4 is a graph showing nitrite-N and nitrate-N concentrations in the mixed liquor during 
the react period of the NOSBR. 

Figure 5 is a graph showing mixed liquor nitrite-N concentrations during the react period in
three stages of the NOSBR operated at 2 cycles/day with different concentrations of nitrite in the feed. 

Figure 6 is a graph of mixed liquor nitrite-N concentrations during the react period in three 
representative cycles during operation of the NOSBR at 4 cycles/day. 

Figure 7 is an evolutionary distance tree derived from a comparison of 16S rDNA sequences 
from nitrite oxidising bacteria and clone sequences from three different 16S rDNA clone libraries 
(RC, GC, and SBR). 

Figure 8 is an alignment of sequences of 16S rDNA from Nitrospira clones identified in a 
nitrite-oxidising SBR and from other sources. 

Figure 9 depicts the results of agarose gel electrophoresis of PCR-amplified DNA using 
genomic DNA from various Nitrospira clones as template. 

BEST MODE AND OTHER MODES OF CARRYING OUT THE INVENTION 

The following abbreviations are used hereafter:

SBR      sequencing batch reactor
NOSBR    nitrite oxidising SBR
NOM      nitrite oxidising medium
HRT      hydraulic retention time
MLSS     mixed liquor suspended solids
BNR      biological nutrient removal
DO       dissolved oxygen
PCR      polymerase chain reaction
REA      restriction enzyme analysis
OTU      operational taxonomic unit
bp(s)    base pair(s)



The one-letter code for nucleotides in DNA conforms to the IUPAC-IUB standard described 
in Biochemical Journal 219, 345-373 (1984). 

The term "comprise", or variations of the term such as "comprises" or "comprising", are
used herein to denote the inclusion of a stated integer or stated integers but not to exclude any other
integer or any other integers, unless in the context or usage an exclusive interpretation of the terms is
required.

The present inventors have developed a specific nitrifying biomass that is largely comprised of
bacteria that are most closely related to Nitrospira moscoviensis. It is believed that a range of species
of Nitrospira are involved in the process. The inventors have shown that these bacteria are likely to
be more dominant in reactors with good nitrification performance than bacteria from the genus
Nitrobacter. A range of studies have failed to find Nitrobacter in nitrifying processes (Hovanec and
DeLong, 1996; Wagner et al., 1996) and evidence is provided below that the organisms responsible
for this important biochemical reaction in wastewater treatment processes (both suspended and
attached growth processes) are from the Nitrospira phylum in the domain Bacteria.

With reference to the first embodiment of the invention, the nitrifying biomass can be 
produced by presenting a feed comprising nitrite, dissolved oxygen and dissolved carbon dioxide but 
which is free of organic carbon to seed sludge from any sewage plant exhibiting nitrification. The 
seed sludge is advantageously from a domestic wastewater treatment plant but can also be from an 
abattoir wastewater treatment plant. The nitrite component of the feed can be as low as about 400 
mg/L nitrite-N. The oxygen and carbon dioxide can conveniently be provided as air bubbled through 
the solution. 

Turning to the second embodiment of the invention, oligonucleotide primers typically have a
length of about 12 to 50 nucleotides. A preferred length is 12 to 22 nucleotides. Particularly
preferred primers are the following:

5' CGGGAGGGAAGATGGAGC 3' (SEQ ID NO: 14)

5' CCAACCCGGAAAGCGCAGAG 3' (SEQ ID NO: 15)

5' AGCCTGGCAGTACCCTCT 3' (SEQ ID NO: 16)
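Example 3 reports calculated melting temperatures of 60°C and 66°C for the primers of SEQ ID NO: 14 and SEQ ID NO: 15 respectively. The source does not state which formula was used. The sketch below assumes the Wallace rule, a common approximation for short oligonucleotides, which gives the same two values; the function name `wallace_tm` is illustrative.

```python
def wallace_tm(primer: str) -> int:
    """Estimate melting temperature in deg C as 2*(A+T) + 4*(G+C) (Wallace rule)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 2 * at + 4 * gc

PRIMERS = {
    "SEQ ID NO: 14": "CGGGAGGGAAGATGGAGC",
    "SEQ ID NO: 15": "CCAACCCGGAAAGCGCAGAG",
}

if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, estimated Tm {wallace_tm(seq)} deg C")
```

The rule ignores salt and primer concentration, so it is only a first estimate when choosing an annealing temperature.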

Oligonucleotide primer pairs according to the third embodiment of the invention comprise an
oligonucleotide primer that will anneal to one strand of the target sequence and a second oligonucleotide
primer which will anneal to the other, complementary, strand of the target sequence. It will be
appreciated that the second oligonucleotide primer must anneal to the complementary strand downstream
of the first oligonucleotide primer sequence, which occurs in the complementary strand, to yield a double
stranded amplification product in the PCR. The amplification product is of a size that facilitates
detection. Typically, the first and second oligonucleotide primer sites in the target DNA are separated
by 50 to 1,400 bps. A preferred separation is 400 to 1,000 bps.
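The separation ranges can be applied to a concrete primer pair. Example 3 gives a fragment of approximately 1052 nucleotides for the forward primer at E. coli 16S rDNA positions 440 to 457 paired with a reverse primer at position 1492, which equals 1492 minus 440. The sketch below assumes that simple coordinate difference as the fragment estimate; the function names are hypothetical.

```python
def estimated_fragment_bp(forward_start: int, reverse_position: int) -> int:
    """Approximate product size as the coordinate difference between primer sites."""
    if reverse_position <= forward_start:
        raise ValueError("reverse primer must lie downstream of the forward primer")
    return reverse_position - forward_start

def classify_separation(bp: int) -> str:
    """Place a separation in the ranges stated in the description."""
    if 400 <= bp <= 1000:
        return "preferred (400 to 1,000 bp)"
    if 50 <= bp <= 1400:
        return "typical (50 to 1,400 bp)"
    return "outside the typical range"

if __name__ == "__main__":
    size = estimated_fragment_bp(forward_start=440, reverse_position=1492)
    print(size, classify_separation(size))
```

Under this estimate the pair used in Example 3 falls inside the typical range but above the preferred one.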

The probes of the fourth embodiment, as indicated above, can have a size as small as 12
nucleotides. Typically, however, probes have a length of 15 to 50 nucleotides. A preferred probe
length is 15 to 22 nucleotides, particularly for in situ hybridisation according to the method of the ninth
embodiment.

The oligonucleotide primers included in kits according to the fifth embodiment of the invention
can be individual oligonucleotide primers appropriate for the detection of Nitrospira or a primer pair.
Oligonucleotide primer pairs are advantageously provided as compositions. Additional oligonucleotide
primers can also be included in kits for use in control reactions. For detection purposes, DNA probes
can also be included in kits.

Kits according to the fifth embodiment of the invention can further comprise reagents used in
PCR and hybridisation reactions. Such reagents include buffers, salts, detergents, nucleotides and
thermostable polymerase. Such reagents are advantageously provided as solutions to facilitate execution
of PCR or hybridisation. Solutions can be compositions comprising a number of reagents as is well
known in the art.

The general techniques used in the methods of the sixth to ninth embodiments, and factors to
be considered in selecting PCR primers and probes, will be known to those of skill in the art. Such
techniques are described, for example, in Sambrook et al. (1989) and Stackebrandt and Goodfellow
(1991), the entire contents of which are incorporated herein by cross reference. Particularly relevant
chapters in Stackebrandt and Goodfellow are Chapter 7, "The Polymerase Chain Reaction" by S.
Giovannoni, and Chapter 8, "Development and Application of Nucleic Acid Probes" by D. A. Stahl
and R. Amann.

Non-limiting examples of the invention will now be provided.

General Methods

The total community DNAs from the NOSBR sludge (RC) and the seed sludge (GC) were
isolated, the 16S rDNAs were polymerase chain reaction (PCR) amplified and cloned using previously
published methods (Blackall, 1994; Blackall et al., 1994; Bond et al., 1995). Inserts from 102 clones
in the RC library were amplified and grouped by HaeIII restriction enzyme digestion banding profiles
(REA) into operational taxonomic units (OTUs) (Weidner et al., 1996). Clone inserts from
representatives of RC OTUs and all 77 clones from the GC library were PCR amplified and partially
sequenced (Blackall, 1994) using the 530f primer (Lane, 1991). Inserts from a selection of clones were
fully sequenced (Blackall, 1994). Sequence data were analysed according to previously published
methods (Blackall et al., 1994) which included BLAST (Altschul et al., 1990) comparisons and
phylogenetic analyses (Felsenstein, 1993).

Example 1

Selection of a Nitrifying Biomass

In this example, we describe the use of a laboratory-scale reactor as a sequencing batch
reactor (SBR) with strong selection for a nitrite oxidising biomass. Seed sludge was from the
Merrimac domestic wastewater treatment plant operated by the Gold Coast City Council and located
at Merrimac, Queensland 4226, Australia. The reactor set-up will be hereafter referred to as the
"Nitrite Oxidising SBR", or "NOSBR".

Reactor. A laboratory chemostat with a working volume of 1 L was operated in the dark at
24°C as the NOSBR. The influent nitrite oxidising medium (NOM) was a synthetic waste water mix
comprising per L: 400 to 3,200 mg KNO₂, 3.75 g MgSO₄·7H₂O, 250 mg CaCl₂·2H₂O, 10 g
KH₂PO₄, 10 g K₂HPO₄, 200 mg FeSO₄·7H₂O, and 20 g NaHCO₃. The pH of the medium was
adjusted to 7.0, but the reactor was not equipped with pH control. Dissolved oxygen was maintained
at 1.6-2.0 mg/L and CO₂ was introduced by bubbling air through the liquid in the NOSBR. Surface
biomass growth was precluded by regular scrubbing of all solid surfaces with a brush. Four cycles
per day giving a hydraulic retention time (HRT) of 12 hr were performed with the following
sequence:

1) Feed of 500 ml of fresh medium - 30 min (0 to 0.5 hr)

2) React (aeration) - 4.5 hr (0.5 to 5 hr)

3) Settle - 40 min (5 to 5.7 hr)

4) Decant 500 ml of supernatant - 20 min (5.7 to 6 hr)

5) Total time per cycle - 6 hr.
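The cycle timing and the stated HRT follow from the reactor volume and feed volume. A minimal sketch of that arithmetic, assuming HRT is the working volume divided by the daily feed volume; the function names are illustrative.

```python
def hydraulic_retention_time_hr(working_volume_l: float,
                                feed_volume_l: float,
                                cycles_per_day: int) -> float:
    """HRT in hours = working volume / (feed volume per cycle * cycles per day) * 24."""
    daily_flow_l = feed_volume_l * cycles_per_day
    return 24.0 * working_volume_l / daily_flow_l

def cycle_length_hr(phase_minutes) -> float:
    """Total cycle length in hours from the individual phase durations in minutes."""
    return sum(phase_minutes) / 60.0

if __name__ == "__main__":
    # Four cycles/day scheme: feed 30 min, react 4.5 hr, settle 40 min, decant 20 min.
    print(cycle_length_hr([30, 270, 40, 20]))           # hours per cycle
    print(hydraulic_retention_time_hr(1.0, 0.5, 4))     # HRT in hours
    # Two cycles/day scheme: feed 40 min, react 10 hr, settle 40 min, decant 40 min.
    print(cycle_length_hr([40, 600, 40, 40]))
    print(hydraulic_retention_time_hr(1.0, 0.5, 2))
```

With a 1 L working volume and 500 ml exchanged per cycle, four cycles per day give the 12 hr HRT stated above, and the same formula gives 24 hr for the two cycles per day scheme.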

Automatic timers controlled the magnetic stirrer (100 rpm), peristaltic pumps (feed and decant),
and air pump for the cycles. Sludge biomass was not wasted from the reactor, but periodically
biomass was collected for testing, which facilitated maintenance of a relatively steady amount of
biomass in the SBR.

At start up, 1 L of mixed liquor suspended solids (MLSS) from a full scale Biological Nutrient
Removal (BNR, nitrogen and phosphorus removal) plant was added to the NOSBR which was
operated manually with the NOM. Initial manual and then automatic operation with 2 cycles per day
(feed [500 ml] - 40 min; react - 10 hr; settle - 40 min; and decant [500 ml] - 40 min) occurred for
some months before initiation of the 4 cycles per day scheme (see above).

Monitoring. Chemical analyses of feed, mixed liquor and effluent were regularly done for
nitrite-N (NO₂-N), nitrate-N (NO₃-N), and ammonium-N (NH₄⁺-N) using spectrometry assays
(Merck, Melbourne, Australia). To preclude the removal of excessive biomass, these analyses were
done with 2 ml samples. The MLSS of the NOSBR was determined in duplicate 10 ml samples of
mixed liquor. These were filtered onto pre-dried Whatman GF/C filters, and then dried to a constant
weight at 105°C. A pH meter was used to periodically monitor pH in the mixed liquor and
effluent. A portable dissolved oxygen (DO) meter and probe were used to periodically monitor the
DO in the NOSBR.
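The MLSS determination described above reduces to a mass difference divided by the sample volume. The sketch below assumes the usual gravimetric calculation; the filter masses are hypothetical values chosen for illustration and are not data from the source.

```python
from statistics import mean

def mlss_mg_per_l(filter_tare_mg: float, dried_total_mg: float,
                  sample_volume_ml: float) -> float:
    """MLSS in mg/L from the dried filter mass gain and the filtered sample volume."""
    solids_mg = dried_total_mg - filter_tare_mg
    return solids_mg * 1000.0 / sample_volume_ml

if __name__ == "__main__":
    # Hypothetical duplicate measurements: (pre-dried filter mass, mass after drying at 105 C).
    duplicates_mg = [(120.0, 135.0), (119.0, 135.0)]
    values = [mlss_mg_per_l(tare, dried, 10.0) for tare, dried in duplicates_mg]
    print(values, mean(values))
```

Averaging the duplicates, as done here, is one reasonable way to report a single MLSS value per sampling time.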

Results of operation. Varying influent nitrite levels were employed to study a range of features
of the selected nitrite oxidising biomass. The operating data for the influent and effluent nitrite levels
of the NOSBR during the automated 2 cycles/day period are presented in Figure 1 and for the
automated 4 cycles/day in Figure 2. The data presented in these figures show that the microbial
community is able to remove all the nitrite from the influent in a matter of hours.

Attributes of the NOSBR mixed liquor

1. Suspended versus attached growth - 2 cycles/day. To generate attached growth, the regular
scrubbing regime of the reactor was suspended for two weeks. The vast bulk of the biomass was then
attached to surfaces in the reactor. The little remaining suspended biomass was discharged from the
reactor which was then filled with 1 L of half strength NOM. Regular sampling and nitrite analyses
were done during the react period of one cycle with all the biomass attached to the reactor surfaces.
The results of this experiment are presented in Figure 3. The results show that suspended biomass has
twice the nitrite oxidation rate of the attached biomass but both systems are effective in removing
nitrite from the influent.

Following the experiment described in the previous paragraph, the biomass was completely
scrubbed from the surfaces to the liquid. The reactor was operated for two cycles with biomass
scrubbing. A similar one-cycle study was performed as with the attached growth but with all biomass
suspended. The biofilm growth exhibited a nitrite oxidation rate of 29 mg NO₂-N/hr and the
suspended growth form showed a rate of 58 mg NO₂-N/hr. It was assumed that the biomass
concentration was the same for both studies since none had been removed between them.

2. pH correlation with nitrification. It was observed that when the pH of the effluent fell
below 7.4, nitrite-N was present in the effluent. If the pH rose above 7.4 for short periods, no effect
on nitrification was observed. Therefore, pH values below 7.4 were detrimental to nitrification.

3. Cyclic studies. Figure 4 shows the results for periodic measurements of nitrite-N and
nitrate-N during the react period of the reactor during 2 cycles/day operation. The results presented in
this figure show that the bacterial population in the reactor oxidised nitrite to nitrate in a stoichiometric
manner with 160 mg/L of nitrite-N being oxidised to 160 mg/L of nitrate-N (170 mg/L at the start of the
react period and 330 mg/L when the nitrite-N was exhausted). The rate of nitrite oxidation and nitrate
production also appeared to be linear, showing that the oxidation process was not limited by any
external factors.
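The stoichiometric claim can be restated as a simple nitrogen balance over the react period, using only the concentrations quoted above. A minimal sketch follows; the function names are illustrative.

```python
def nitrate_formed_mg_l(nitrate_start_mg_l: float, nitrate_end_mg_l: float) -> float:
    """Nitrate-N produced over the react period."""
    return nitrate_end_mg_l - nitrate_start_mg_l

def conversion_ratio(nitrite_oxidised_mg_l: float,
                     nitrate_start_mg_l: float,
                     nitrate_end_mg_l: float) -> float:
    """Nitrate-N formed per unit nitrite-N oxidised; 1.0 means stoichiometric conversion."""
    return nitrate_formed_mg_l(nitrate_start_mg_l, nitrate_end_mg_l) / nitrite_oxidised_mg_l

if __name__ == "__main__":
    # Values quoted in the text: 160 mg/L nitrite-N oxidised, nitrate-N rising from 170 to 330 mg/L.
    print(nitrate_formed_mg_l(170.0, 330.0))
    print(conversion_ratio(160.0, 170.0, 330.0))
```

A ratio of 1.0 on a nitrogen basis is what the one-to-one oxidation of nitrite to nitrate predicts.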

Studies measuring nitrite reaction in the reactor are shown for both 2 cycles/day (Figure 5)
and 4 cycles/day operation (Figure 6). The significance of these results is that the biomass is robust in
its capacity to oxidise nitrite under a range of operating conditions.

Example 2

The Microbiology of the NOSBR

In this example, we describe the microbiological characterisation of the nitrifying
microorganisms present in the biomass selected in the NOSBR described in Example 1. Methods used
in the characterisation have been described by Blackall (1994) and Bond et al. (1995), the entire
contents of which disclosures are incorporated herein by cross-reference.

Total microbial community DNA from both the seed BNR sludge (GC) and from the reactor
after six months of operation (RC) was obtained. The 16S rDNA from each DNA extract was
separately amplified by polymerase chain reaction (PCR), and then for each, clone libraries were
prepared (Blackall, 1994; Bond et al., 1995).

Inserts from a total of 77 clones from the GC clone library were partially sequenced with the
primer 530f and phylogenetically analysed (Blackall et al., 1994) (Table 1). The majority of the clone
sequences grouped with the proteobacterial phylum, while 4% (3 clones; GC3, GC86 and GC109)
grouped with the phylum Nitrospira.

Table 1

Phyla from the Domain Bacteria Represented in the GC Clone Library

Phylum in Domain Bacteria                Percentage in clone library

Proteobacteria
    alpha
    beta
    gamma
    delta
High mol% G+C Gram positives             [illegible]
Low mol% G+C Gram positives              7
Flexibacter/Cytophaga/Bacteroides
Nitrospira
Planctomycetales
Unaffiliated

Percentage values extracted without row alignment: 15, 20, 5, 29, 18, 4.



Restriction Enzyme Analysis (REA) of the RC library was done to group clones into
operational taxonomic units (OTUs) in advance of partial or complete clone insert sequencing
(Weidner et al., 1996). Thirteen different OTUs were found when HaeIII was employed as the
restriction enzyme to digest the inserts from 102 clones. The large majority of the clone inserts (88%
or 90 clones) were found in one OTU while the remaining 12% (12 clones) comprised individuals in
12 other OTUs. Each of the clone inserts from the latter 12 OTUs and six of the large former group
(RC7, RC11, RC16, RC25, RC73, and RC99) were partially sequenced and phylogenetically
analysed. These six and one of the other OTUs (RC90) were found to have partial insert sequences
that phylogenetically grouped with the Nitrospira phylum. From this analysis, it was concluded that
91 clones or 89% of the clone library originated from bacteria in the Nitrospira phylum. In the
phylogenetic analysis, one of the other OTUs (RC44) grouped with Nitrobacter. It was concluded that
the organisms responsible for nitrification in the NOSBR were likely to be from the Nitrospira
phylum.
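The clone library percentages quoted above follow directly from the clone counts. A short sketch of the arithmetic, rounding to whole percentages as the text does; the function name is illustrative.

```python
def percent_of_library(clone_count: int, library_size: int) -> int:
    """Share of a clone library, rounded to the nearest whole percent."""
    return round(100.0 * clone_count / library_size)

if __name__ == "__main__":
    rc_library = 102   # clones in the RC (NOSBR) library
    gc_library = 77    # clones in the GC (seed sludge) library
    print("dominant OTU:", percent_of_library(90, rc_library))
    print("other 12 OTUs:", percent_of_library(12, rc_library))
    print("Nitrospira clones in RC:", percent_of_library(90 + 1, rc_library))
    print("Nitrospira clones in GC:", percent_of_library(3, gc_library))
```

The 91 Nitrospira clones are the 90 clones of the dominant OTU plus RC90, so selection in the NOSBR raised the Nitrospira share from about 4% in the seed sludge to about 89%.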

Near complete insert sequence analyses were done for the following clones:

- six RC clones of the original partial sequences - RC7, RC11, RC25, RC73, RC90, and RC99
(RC16 omitted);
- two RC clones from the Nitrospira OTU (RC14 and RC19);
- one of the three GC Nitrospira clones (GC86); and
- four clones from a clone library prepared by Bond et al. (1995) that phylogenetically grouped in
the Nitrospira phylum.

The data were phylogenetically analysed as shown in Figure 7. The two clone clades would 
likely comprise two separate species with the RC clones possibly comprising more than one species. 

Sequences of clones from the two Nitrospira clades were subjected to direct pairwise sequence
comparison. The results of this comparison are presented in Table 2. The table is a similarity matrix
showing the percent similarity between 16S rDNA sequences of Nitrospira moscoviensis, Nitrospira
marina and 13 near complete sequences from clone inserts from a full scale biological nutrient
removal activated sludge plant (GC86), from the NOSBR (RC clone numbers) and from clones for
which the partial sequences had been previously reported (SBR clones; Bond et al., 1995). The
similarity matrix showed that the first clade (SBR1015, SBR1024, SBR2046, GC86) had an average
16S rDNA comparison value of 99.4% while for the second clade (RC7, RC11, RC14, RC19, RC25,
RC73, RC90, RC99, SBR2016), this value was 98.7%. The highest comparative value between an
RC clone sequence and N. moscoviensis was 93.4% for RC25. From the sequence data analysis, the
two clone clades would likely comprise two separate species, with the RC clones possibly comprising
more than one species.

Sequence data for the SBR, GC and RC clones are presented in Figure 8. In this figure,
sequences are divided into blocks with numbers given in square brackets above each block. The clone
identification is given at the left of a line of sequence in each block. Dashes represent unknown
nucleotides while full stops represent alignment breaks.

Sequences of clones are also presented as sequence listings as follows:



Clone      Sequence Listing Number

SBR1024    1
SBR1015    2
GC86       3
SBR2046    4
RC25       5
RC19       6
SBR2016    7
RC7        8
RC14       9
RC99       10
RC11       11
RC73       12
RC90       13

Example 3

Identification of Nitrospira Species

Primers for use in a diagnostic PCR for the Nitrospira moscoviensis clade of Figure 7 (see
Example 2) were designed from aligned sequence datasets (see Tables 3-5 below).

Table 3 is an alignment of 16S rDNA sequences of Nitrospira phylum members and nitrite
oxidisers from other bacterial phyla which was used to design the primer MOS457f (SEQ ID NO: 14)
for the Nitrospira moscoviensis clade. In the table, mismatches with the primer sequence are in bold
type and are underlined. The melting temperature calculated for MOS457f was 60°C and a fragment
size of approximately 1052 nucleotides was calculated in a PCR with primer 1492r. The MOS457f
sequence corresponds to the sequence at positions 440 to 457 of the E. coli 16S rDNA gene.

Table 3

Source of Sequence and Number of Sequence       Sequence              Mismatches
in Sequence Listings

MOS457f primer (SEQ ID NO: 14)                  CGGGAGGGAAGATGGAGC    -
Nitrococcus mobilis (SEQ ID NO: 17)             CAGCCGGGAGGAAAAGCA    10
Magnetobacterium bavaricum (SEQ ID NO: 18)      TGTAGGGAAAGATGATGA    8
Nitrobacter hamburgensis (SEQ ID NO: 19)        TGTGCGGGAAGATAATGA    7
Nitrospina gracilis (SEQ ID NO: 20)             CGGGTGGGAAGAACAAAA    6
Nitrospira marina (SEQ ID NO: 21)               CATGAGGAAAGATAAAGT    6
SBR1015 (SEQ ID NO: 22)                         CGGCAGGGAAGATGGAAC    2
SBR1024 (SEQ ID NO: 22)                         CGGCAGGGAAGATGGAAC    2
SBR2016 (SEQ ID NO: 23)                         CGGGAGGGAAGATGGAGC    0
SBR2046 (SEQ ID NO: 24)                         CCGCAGGGAAGATGGAAC    3
RC7 (SEQ ID NO: 23)                             CGGGAGGGAAGATGGAGC    0
RC11 (SEQ ID NO: 23)                            CGGGAGGGAAGATGGAGC    0
RC14 (SEQ ID NO: 23)                            CGGGAGGGAAGATGGAGC    0
RC19 (SEQ ID NO: 23)                            CGGGAGGGAAGATGGAGC    0
RC25 (SEQ ID NO: 23)                            CGGGAGGGAAGATGGAGC    0
RC73 (SEQ ID NO: 25)                            CGGGAGGGAAGATGGAAC    1
RC90 (SEQ ID NO: 25)                            CGGGAGGGAAGATGGAAC    1
RC99 (SEQ ID NO: 23)                            CGGGAGGGAAGATGGAGC    0
RC44 (Nitrobacter clone) (SEQ ID NO: 26)        CGTGCGGGAAGATAATGA    6
GC86 (SEQ ID NO: 27)                            CGGCAGGGAAGATGGAAC    2
Nitrospira moscoviensis (SEQ ID NO: 28)         CGGGAGGGAAGATGGACG    2
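
The mismatch column of Table 3 can be reproduced by a position-by-position comparison of each aligned target with the primer sequence. The following is a minimal sketch; the function name `count_mismatches` and the dictionary layout are illustrative assumptions, while the sequences are those listed in Table 3.

```python
def count_mismatches(primer: str, target: str) -> int:
    """Count aligned positions at which the target differs from the primer."""
    if len(primer) != len(target):
        raise ValueError("aligned sequences must have the same length")
    return sum(1 for p, t in zip(primer.upper(), target.upper()) if p != t)


MOS457F = "CGGGAGGGAAGATGGAGC"

# A few rows of Table 3 (aligned target site for each source).
TABLE_3_ROWS = {
    "Nitrobacter hamburgensis": "TGTGCGGGAAGATAATGA",
    "SBR1015": "CGGCAGGGAAGATGGAAC",
    "RC73": "CGGGAGGGAAGATGGAAC",
    "RC25": "CGGGAGGGAAGATGGAGC",
    "RC44": "CGTGCGGGAAGATAATGA",
    "Nitrospira moscoviensis": "CGGGAGGGAAGATGGACG",
}

mismatches = {name: count_mismatches(MOS457F, seq) for name, seq in TABLE_3_ROWS.items()}
```

The same function applies unchanged to the alignments for the other primers, provided each target is given in the same orientation as the primer.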
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Like Table 3, Table 4 is an alignment of 16S rDNA sequences of Nitrospira phylum members 
and nitrite oxidisers from other bacterial phyla which was used to design the primer MOS638f (SEQ 
ID NO: 15) for the Nitrospira moscoviensis clade. Again, mismatches with the primer sequence are 
in bold and are underlined. The calculated melting temperature for this primer was 66°C and a 
fragment size of approximately 873 nucleotides was calculated in a PCR with primer 1492r. The
MOS638f sequence corresponds to the sequence at positions 619 to 638 of the E. coli 16S rDNA 
gene. 

Table 4

Source of Sequence and Number of Sequence       Sequence                Mismatches
in Sequence Listings

MOS638f primer (SEQ ID NO: 15)                  CCAACCCGGAAAGCGCAGAG    -
Nitrococcus mobilis (SEQ ID NO: 29)             TCAACCTGGGAATTGCATCC    8
Magnetobacterium bavaricum (SEQ ID NO: 30)      TCAACCCGGGAATTGCCTTG    7
Nitrobacter hamburgensis (SEQ ID NO: 31)        TCAACTCCAGAACTGCCTTT    11
Nitrospina gracilis (SEQ ID NO: 32)             TCAACCGTGGAATTGCGTTT    10
Nitrospira marina (SEQ ID NO: 33)               TTAACCGGGAAAGGTCGAGA    9
[row illegible in source]
SBR1024 (SEQ ID NO: 34)                         CTAACCCGGAAAGTGCGGAG    3
SBR2016 (SEQ ID NO: 35)                         CCAACCCGAAAAGCGCAGAG    1
SBR2046 (SEQ ID NO: 34)                         CTAACCCGGAAAGTGCGGAG    3
RC7 (SEQ ID NO: 36)                             CCAACCCGGAAAGCGCAGAG    0
RC11 (SEQ ID NO: 36)                            CCAACCCGGAAAGCGCAGAG    0
RC14 (SEQ ID NO: 36)                            CCAACCCGGAAAGCGCAGAG    0
RC19 (SEQ ID NO: 36)                            CCAACCCGGAAAGCGCAGAG    0
RC25 (SEQ ID NO: 36)                            CCAACCCGGAAAGCGCAGAG    0
RC73 (SEQ ID NO: 36)                            CCAACCCGGAAAGCGCAGAG    0
RC90 (SEQ ID NO: 36)                            CCAACCCGGAAAGCGCAGAG    0
RC99 (SEQ ID NO: 36)                            CCAACCCGGAAAGCGCAGAG    0
RC44 (Nitrobacter clone) (SEQ ID NO: 37)        TCAACTCCAGAACTGCCTTT    11
GC86 (SEQ ID NO: 34)                            CTAACCCGGAAAGTGCGGAG    3
Nitrospira moscoviensis (SEQ ID NO: 38)         CCAACCCGGAAAGCGCAGAG    0
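
The text does not say how the primer melting temperatures were calculated. As an illustration only, the sketch below assumes the simple Wallace rule, Tm = 2(A+T) + 4(G+C) in °C, which gives the quoted 60°C for MOS457f and 66°C for MOS638f. The function name `wallace_tm` is hypothetical.

```python
def wallace_tm(primer: str) -> int:
    """Estimate a primer melting temperature (deg C) with the Wallace rule.

    Assumption: Tm = 2 * (A + T) + 4 * (G + C). The source does not state
    which formula was actually used.
    """
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


tm_mos457f = wallace_tm("CGGGAGGGAAGATGGAGC")    # MOS457f (SEQ ID NO: 14)
tm_mos638f = wallace_tm("CCAACCCGGAAAGCGCAGAG")  # MOS638f (SEQ ID NO: 15)
```

The Wallace rule is a rough estimate intended for short oligonucleotides; nearest-neighbour methods would give different values.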



Table 5 is again an alignment of 16S rDNA sequences of Nitrospira phylum members and
nitrite oxidisers from other bacterial phyla which was used to design the primer MOS635r (SEQ ID
NO: 16) for the Nitrospira moscoviensis clade. The melting temperature calculated for this primer
was 58°C and a fragment size of approximately 625 nucleotides was calculated in a PCR with primer
27f. The MOS635r sequence corresponds to the sequence at positions 635 to 652 of the E. coli 16S
rDNA sequence.

Table 5

Source of Sequence and Number of Sequence       Sequence              Mismatches
in Sequence Listings

MOS635r primer (SEQ ID NO: 16)                  AGCCTGGCAGTACCCTCT    -
Nitrococcus mobilis (SEQ ID NO: 39)             AGCCAAACAGTATCGGAT    7
Magnetobacterium bavaricum (SEQ ID NO: 40)      AGTTAAACAGTTTTCAAG    11
Nitrobacter hamburgensis (SEQ ID NO: 41)        AGACCTTCAGTATCAAAG    9
Nitrospina gracilis (SEQ ID NO: 42)             AGCCGAATAGTTTCAAAC    10
Nitrospira marina (SEQ ID NO: 43)               AGCTGAATAGTTCCTCTC    10
SBR1015 (SEQ ID NO: 44)                         AGCCGAGCAGTCCCCTCC    4
SBR1024 (SEQ ID NO: 44)                         AGCCGAGCAGTCCCCTCC    4
SBR2016 (SEQ ID NO: 45)                         AGCCTGGCAGTACCCTCT    0
SBR2046 (SEQ ID NO: 44)                         AGCCGAGCAGTCCCCTCC    4
RC7 (SEQ ID NO: 46)                             AGCCTGGCAGTACCCCCT    1
RC11 (SEQ ID NO: 45)                            AGCCTGGCAGTACCCTCT    0
RC14 (SEQ ID NO: 45)                            AGCCTGGCAGTACCCTCT    0
RC19 (SEQ ID NO: 45)                            AGCCTGGCAGTACCCTCT    0
RC25 (SEQ ID NO: 47)                            AGCCTGGCAGTACCGTCT    1
RC73 (SEQ ID NO: 45)                            AGCCTGGCAGTACCCTCT    0
RC90 (SEQ ID NO: 45)                            AGCCTGGCAGTACCCTCT    0
RC99 (SEQ ID NO: 45)                            AGCCTGGCAGTACCCTCT    0
RC44 (Nitrobacter clone) (SEQ ID NO: 48)        AGATCCTCAGTATCAAAG    10
GC86 (SEQ ID NO: 44)                            AGCCGAGCAGTCCCCTCC    4
Nitrospira moscoviensis (SEQ ID NO: 49)         AGCCTGGCAGTACCCTCT    0
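
The approximate fragment sizes quoted for the three clade primers follow from the E. coli 16S rDNA positions given in the text. The sketch below assumes that the universal partner primers sit at the positions implied by their names (27 for 27f, 1492 for 1492r), and that the size is taken as the difference between the outer position of the clade primer and the partner position. Both points are assumptions that happen to reproduce the stated values of 1052, 873 and 625 nucleotides.

```python
# Assumed E. coli positions of the universal partner primers, read off their names.
PARTNER_POSITION = {"27f": 27, "1492r": 1492}


def approx_fragment_size(primer_start: int, primer_end: int, partner: str) -> int:
    """Approximate amplicon length from E. coli 16S rDNA numbering."""
    partner_pos = PARTNER_POSITION[partner]
    if partner_pos > primer_end:
        # Clade primer acts as the forward primer; the partner lies downstream.
        return partner_pos - primer_start
    # Clade primer acts as the reverse primer; the partner lies upstream.
    return primer_end - partner_pos


size_mos457f = approx_fragment_size(440, 457, "1492r")
size_mos638f = approx_fragment_size(619, 638, "1492r")
size_mos635r = approx_fragment_size(635, 652, "27f")
```

Real amplicon lengths depend on the target organism's own sequence, so these figures are only expected to be approximate.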



The three primers defined above in Tables 3 to 5 were included in separate primer pairs, which
were then tested in PCR amplifications using genomic DNA from various Nitrospira clones as
template. The PCRs were carried out according to methods detailed in Sambrook et al. (1989) at an
annealing temperature of 62°C.

The results of electrophoretic analysis of PCRs on an agarose gel are presented in Figure 9.
Details of the material analysed in each lane of the gel are given in Table 6. The marker DNA was
HaeIII-digested φX174 DNA. The sizes of the φX174 fragments are given on the left-hand side of the
figure.

Table 6

Lane    Primer pair used                 Mismatches between primer and template

1       (HaeIII-digested φX174 DNA)
2       MOS457f, 1492r                   0 mismatches with MOS457f
3       MOS457f, 1492r                   1 mismatch with MOS457f
4       MOS457f, 1492r                   2 mismatches with MOS457f
5       (HaeIII-digested φX174 DNA)
6       MOS638f, 1492r                   0 mismatches with MOS638f
7       MOS638f, 1492r                   1 mismatch with MOS638f
8       MOS638f, 1492r                   3 mismatches with MOS638f
9       (HaeIII-digested φX174 DNA)
10      MOS635r, 27f                     0 mismatches with MOS635r
11      MOS635r, 27f                     1 mismatch with MOS635r
12      MOS635r, 27f                     4 mismatches with MOS635r



The results presented in Figure 9 show that an amplicon of the appropriate size was obtained
in reactions where there was up to one mismatch between a primer and the template but that no
amplicon was produced where there was a greater degree of mismatch.

When the three primer pairs used for the results presented in Figure 9 were used with clone 
RC44 (closest match to Nitrobacter), no amplicons were produced. 
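
The observed behaviour can be summarised as a simple threshold rule: under the conditions used (annealing at 62°C), a template gave an amplicon when it had at most one mismatch with the clade primer. The sketch below encodes that rule. The threshold is read off the reported results rather than derived from any model, and the function names are illustrative assumptions.

```python
MAX_TOLERATED_MISMATCHES = 1  # empirical threshold from the gel results described above


def mismatches_with_primer(primer: str, template_site: str) -> int:
    """Count differing positions between a primer and an aligned template site."""
    if len(primer) != len(template_site):
        raise ValueError("primer and template site must have the same length")
    return sum(1 for p, t in zip(primer, template_site) if p != t)


def predicts_amplicon(primer: str, template_site: str,
                      max_mismatches: int = MAX_TOLERATED_MISMATCHES) -> bool:
    """Return True if the threshold rule predicts an amplicon."""
    return mismatches_with_primer(primer, template_site) <= max_mismatches


MOS635R = "AGCCTGGCAGTACCCTCT"
rc7_amplifies = predicts_amplicon(MOS635R, "AGCCTGGCAGTACCCCCT")      # RC7, Table 5
sbr1015_amplifies = predicts_amplicon(MOS635R, "AGCCGAGCAGTCCCCTCC")  # SBR1015, Table 5
rc44_amplifies = predicts_amplicon(MOS635R, "AGATCCTCAGTATCAAAG")     # RC44, Table 5
```

The rule says nothing about mismatch position; a mismatch at the 3' end of a primer would normally be expected to matter more than one near the 5' end.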

The primer NIT3 (Wagner et al. 1996; SEQ ID NO: 50) was used in a diagnostic PCR for
Nitrobacter. NIT3 was designed originally for fluorescent in situ hybridisation experiments. The
specificity of this primer can be appreciated from the sequence alignment presented in Table 7 which
is an alignment of 16S rDNA sequences of Nitrospira phylum members and nitrite oxidisers from
other bacterial phyla against NIT3. A melting temperature of 60°C was calculated for NIT3 and a
fragment size of approximately 1020 nucleotides in a PCR with primer 27f was experimentally
determined. The NIT3 sequence corresponds to the sequence at positions 1031 to 1048 of the E. coli
16S rDNA gene.

Table 7



Source of Sequence and Number of Sequence       Sequence              Mismatches
in Sequence Listings

NIT3 primer (SEQ ID NO: 50)                     CCTGTGCTCCATGCTCCG    -
Nitrobacter hamburgensis (SEQ ID NO: 51)        CCTGTGCTCCATGCTCCG    0
Nitrospina gracilis (SEQ ID NO: 52)             CCTGTGCAAGGGCCCCGA    9
Nitrococcus mobilis (SEQ ID NO: 53)             CCTGTCATCCGGTTCCCG    7
Nitrospira moscoviensis (SEQ ID NO: 54)         CCTGAGCACGCTGGTATT    8
Nitrospira marina (SEQ ID NO: 55)               CCTGAGCTCGCTCCCCTT    7
Magnetobacterium bavaricum (SEQ ID NO: 56)      CCTGTGCAAGCTCTCCCT    8
SBR1015 (SEQ ID NO: 57)                         CCTGAGCAGGATGGTATT    8
SBR1024 (SEQ ID NO: 57)                         CCTGAGCAGGATGGTATT    8
SBR2016 (SEQ ID NO: 58)                         CCTGAGCACGCTGGTATT    8
SBR2046 (SEQ ID NO: 57)                         CCTGAGCAGGATGGTATT    8
RC7 (SEQ ID NO: 58)                             CCTGAGCACGCTGGTATT    8
RC11 (SEQ ID NO: 58)                            CCTGAGCACGCTGGTATT    8
RC14 (SEQ ID NO: 58)                            CCTGAGCACGCTGGTATT    8
RC19 (SEQ ID NO: 58)                            CCTGAGCACGCTGGTATT    8
RC25 (SEQ ID NO: 58)                            CCTGAGCACGCTGGTATT    8
RC73 (SEQ ID NO: 58)                            CCTGAGCACGCTGGTATT    8
RC90 (SEQ ID NO: 58)                            CCTGAGCACGCTGGTATT    8
GC86 (SEQ ID NO: 59)                            CCTGAGCAGGATGGTGTT    8
RC99 (SEQ ID NO: 58)                            CCTGAGCACGCTGGTATT    8



Results of PCRs with the primer pair NIT3 and 27f showed that the NIT3 primer specifically
amplified only RC44 clone inserts (Nitrobacter) and not those from Nitrospira clones.

The different primer pairs were then used with DNAs extracted from sludges and the results
are tabulated below in Table 8. The scorings presented in the table were generated by quantitating by
eye the intensity of the amplificate in a stained gel. A definition of the scoring follows: - = no band;
+/- = very faint band; + through ++++ = increasing intensity of the amplificate.

Table 8



Wastewater Treatment Plant     Performance              MOS635r-27f    NIT3-27f
                                                        620 bp         1020 bp

Oxley                          Full nitrification       ++++           ++
Merrimac                       Full nitrification       ++++           ++
Loganholme                     Full nitrification       +++            +/-
Gibson Island                  Full nitrification       +++
Fairfield                      No nitrification         +/-            +++
Cannon Hill                    Full nitrification       +
NOSBR                          NO2- oxidation           +++++          ++++
Saline waste water BNR SBR     Partial nitrification    +/-
Nitrifying biofilm reactor     Full nitrification       ++++           ++++
Phenol/cyanide removing SBR    No nitrification         +/-            ++
BNR SBR                        Full nitrification       +              +



These results show that in plants having good nitrification, Nitrospira species were present as
evidenced by amplification of target DNA with the selected primer pairs.
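
One way to check this reading of Table 8 is to put the by-eye scores on an ordinal scale and list the plants whose MOS635r-27f band is at least "+". The sketch below does so. The ordinal encoding and the function names are assumptions made for illustration; the scores are those given in Table 8.

```python
# Ordinal scale for the by-eye scores. "+++++" occurs in Table 8 (NOSBR) even
# though the stated scale runs only to "++++", so it is placed at the top.
SCORE_ORDER = ["-", "+/-", "+", "++", "+++", "++++", "+++++"]

# (performance, MOS635r-27f score) for each sample in Table 8.
TABLE_8_MOS635R = {
    "Oxley": ("Full nitrification", "++++"),
    "Merrimac": ("Full nitrification", "++++"),
    "Loganholme": ("Full nitrification", "+++"),
    "Gibson Island": ("Full nitrification", "+++"),
    "Fairfield": ("No nitrification", "+/-"),
    "Cannon Hill": ("Full nitrification", "+"),
    "NOSBR": ("NO2- oxidation", "+++++"),
    "Saline waste water BNR SBR": ("Partial nitrification", "+/-"),
    "Nitrifying biofilm reactor": ("Full nitrification", "++++"),
    "Phenol/cyanide removing SBR": ("No nitrification", "+/-"),
    "BNR SBR": ("Full nitrification", "+"),
}


def score_rank(score: str) -> int:
    """Position of a score on the ordinal scale (higher means a stronger band)."""
    return SCORE_ORDER.index(score)


def plants_with_clear_band(table: dict, minimum: str = "+") -> list:
    """Names of samples whose score is at least `minimum`."""
    threshold = score_rank(minimum)
    return [name for name, (_, score) in table.items() if score_rank(score) >= threshold]


clear_band = plants_with_clear_band(TABLE_8_MOS635R)
```

With this encoding, every sample reported as fully nitrifying appears in `clear_band`, while the two non-nitrifying samples score only "+/-".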
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SEQUENCE LISTING 



(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: CRC for Waste Management and Pollution Control Limited
(B) STREET: High Street
(C) CITY: Kensington
(D) STATE: New South Wales
(E) COUNTRY: Australia
(F) POSTAL CODE (ZIP): 2033

(ii) TITLE OF INVENTION: Aquatic Nitrite Oxidising Microorganisms

(iii) NUMBER OF SEQUENCES: 59

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1428 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CAAGTCGAGC GAGAAGACGT AGCAATACGT TTGTAAAGCG GCGAACGGGT GAGGAATACA 60
TGGGTAACCT ACCTTCGAGT GGGGAATAAC TAGCCGAAAG GTTAGCTAAT ACCGCATACG 120
ACTCCTGGTC TGCGGATCGG GAGAGAAAGC GATACCGTGG GTATCGCGCT CTTGGATGGG 180
CTCATGTCCT ATCAGCTTGT TGGTGAGGTA ACGGCTCACC AAGGCTTCGA CGGGTAGCTG 240
GTCTGAGAGG ACGATCAGCC ACACTGGCAC TGCGACACGG GCCAGACTCC TACGGGAGGC 300
AGCAGTAAGG AATATTGCGC AATGGGCGAC AGCCTGACGC AGCNACGCCG CGTGGGGGAT 360
GAAGGTCTTC GGATTGTAAA CCCCTTTCGG CAGGGAAGAT GGAACGGGTA ACCGTTCGGA 420
CGGTACCTGC AGAAGCAGCC ACGGCTAACT TCGTGCCAGC AGCCGCGGTA ATACGAAGGT 480
GGCAAGCGTT GTTCGGATTT ACTGGGCGTA CAGGGAGCGT AGGCGGTTGG GTAAGCCCTC 540
CGTGAAATCT CCGGGCCTAA CCCGGAAAGT GCGGAGGGGA CTGCTCGGCT AGAGGATGGG 600
AGAGGAGCGC GGAATTCCCG GTGTAGCGGT GAAATGCGTA GAGATCGGGA GGAAGGCCGG 660
TGGCGAAGGC GGCGCTCTGG AACATTTCTG ACGCTGAGGC TCGAAAGCGT GGGGAGCAAA 720
CAGGATTAGA TACCCTGGTA GTCCACGCCT TAAACGATGG ATACTAAGTG TCGGCGGGTT 780
ACCGCCGGTG CCGCAGCTAA CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT 840
GAAACTCAAA GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC 900
GCAACGCGAA GAACCTTACC CAGGCTGGAC ATGCAGGTAG TAGAAGGGTG AAAGCCTAAC 960
GAGGTAGCAA TACCATCCTG CTCAGGTGCT GCATGGCTGT CGTCAGCTCG TGCCGTGAGG 1020
TGTTGGGTTA AGTCCCGCAA CGAGCGCAAC CCCTGTCTTC AGTTACCAAC GGGTCATGCC 1080
GGGAACTCTG GAGAGACTGC CCAGGAGAAC GGGGAGGAAG GTGGGGATGA CGTCAAGTCA 1140
GCATGGCCTT TATGCCTGGG GCCACACACG TGCTACAATG GCCGGTACAA AGCGCTGCAA 1200
ACCCGTAAGG GGGAGCCAAT CCCAAAAAAC CGGCCTCAGT TCAGATTGAG GTCTGCAACT 1260
CGACCTCATG AAGGCGGAAT CGCTAGTAAT CCCGGATCAG CACGCCGGGG TGAATACGTN 1320
CCCGGGCCTT GTACACACCG CCCGTCACAC CACGAAAGTT TGTTGTACCT GAAGTCGTTG 1380
GCGCCAACCG CAAGGAGGCA GACGCCCACG GTATGACCGA TGATTGGG 1428

(2) INFORMATION FOR SEQ ID NO : 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1407 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Nitrospira 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

TAATACATGC AAGTCGAGCG AGAAGACGTA GCAATACGTT TGTAAAGCGG CGAACGGGTG 60
AGGAATACAT GGGTAGCCTA CCCTCGAGTG GGGAATAACT AACCGAAAGG TTAGCTAATA 120
CCGCATACGG CTCCTGGTCT GCGGATCGGG AGAGAAAGCG ATACCGTGGG TATCGCGCTC 180
TTGGATGGGC TCATGTCCTA TCAGCTTGTT GGTGAGGTAA CGGCTCACCA AGGCTTCGAC 240
GGGTAGCTGG TCTGAGAGGA CGATCAGCCA CACTGGCACT GCGACACGGG CCAGACTCCT 300
ACGGGAGGCA GCAGTAAGGA ATATTGCGCA ATGGGCGACA GCCTGACGCA GCNACGCCGC 360
GTGGGGGATG AAGGTCTTCG GATTGTAAAC CCCTTTCGGC AGGGAAGATG GAACGGGTAA 420
CCGTTCGGAC GGTACCTGCA GAAGCAGCCA CGGCTAACTT CGTGCCAGCA GCCGCGGTAA 480
TACGAAGGTG GCAAGCGTTG TTCGGATTTA CTGGGCGTAC AGGGAGCGTA GGCGGTTGGG 540
TAAGCCCTCC GTGAAATCTC CGGGCCTAAC CCGGAAAGTG CGGAGGGGAC TGCTCGGCTA 600
GAGGATGGGA GAGGAGCGCG GAATTCCCGG TGTAGCGGTG AAATGCGTAG AGATCGGGAG 660
GAAGGCCGGT GGCGAAGGCG GCGCTCTGGA ACATTTCTGA CGCTGAGGCT CGAAAGCGTG 720
GGGAGCAAAC AGGATTAGAT ACCCTGGTAG TCCACGCCTT AAACGATGGA TACTAAGTGT 780
CGGCGGGTTA CCGCCGGTGC CGCAGCTAAC GCATTAAGTA TCCCGCCTGG GAAGTACGGC 840
CGCAAGGTTG AAACTCAAAG GAATTGACGG GGGCCCGCAC AAGCGGTGGA GCATGTGGTT 900
TAATTCGACG CAACGCGAAG AACCTTACCC AGGCTGGACA TGCAGGTAGT AGAAGGGTGA 960
AAGCCTAACG AGGTAGCAAT ACCATCCTGC TCAGGTGCTG CATGGCTGTC GTCAGCTCGT 1020
GCCGTGAGGT GTTGGGTTAA GTCCCGCAAC GAGCGCAACC CCTGTCTTCA GTTACCAACG 1080
GGTCATGCCG GGAACTCTGG AGAGACTGCC CAGGAGAACG GGGGAGGAAG GTGGGGATGA 1140
CGTCAAGTCA GCATGGCCTT TATGCCTGGG GCCACACACG TGCTACAATG GCCGGTACAA 1200
AGCGCTGCAA ACCCGTAAGG GGGAGCCAAT CGCAAAAAAC CGGCCTCAGT TCAGATTGAG 1260
GTCTGCAACT CGACCTCATG AAGGCGGAAT CGCTAGTAAT CCCGGATCAG CACGCCGGGG 1320
TGAATACGTN CCCGGACCTT GTACACACCG CCCGTCACAC CACGAAAGTT TGTTGTACCT 1380
GAAGTCGTTG GCGCCAACCG CAAGGAG 1407
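
Each row of a sequence listing carries blocks of ten bases followed by a running position counter, so a listing can be checked against its declared LENGTH by stripping the counters and spacing and counting what remains. The sketch below illustrates this on the last two rows of SEQ ID NO: 2. The function name `parse_listing_rows` is hypothetical.

```python
import re


def parse_listing_rows(rows):
    """Join sequence-listing rows into one string of bases.

    Only the nucleotide codes A, C, G, T and N are kept, which drops the
    spaces between blocks and the trailing position counters.
    """
    return "".join("".join(re.findall(r"[ACGTN]", row.upper())) for row in rows)


# Last two rows of SEQ ID NO: 2; the preceding row ends at position 1320.
LAST_ROWS_SEQ_ID_2 = [
    "TGAATACGTN CCCGGACCTT GTACACACCG CCCGTCACAC CACGAAAGTT TGTTGTACCT 1380",
    "GAAGTCGTTG GCGCCAACCG CAAGGAG 1407",
]

tail = parse_listing_rows(LAST_ROWS_SEQ_ID_2)
end_position = 1320 + len(tail)
```

Here `end_position` agrees with the final counter and with the declared length of 1407 base pairs.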

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1500 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

TTGATCCTGG CTCAGAACGA ACGCTGGCGG CGCGCCTAAT ACATGCAAGT CGAGCGAGAA 60
GACGTAGCAA TACGTTTGTA AAGCGGCGAA CGGGTGAGGA ATACATGGGT AACCTACCCT 120
CGAGTGGGGA ATAACTAGCC GAAAGGTTAG CTAATACCGC ATACGACTCC TGGTCTGCGG 180
ATCGGGAGAG AAAGCGATAC CGTGGGTATC GCGCTCTTGG ATGGGCTCAT GTCCTATCAG 240
CTTGTTGGTG AGGTAACGGC TCACCAAGGC TTCGACGGGT AGCTGGTCTG AGAGGACGAT 300
CAGCCACACT GGCACTGCGA CACGGGCCAG ACTCCTACGG GAGGCAGCAG TAAGGAATAT 360
TGCGCAATGG GCGACAGCCT GACGCAGCNA CGCCGCGTGG GGGATGAAGG TCTTCGGATT 420
GTAAACCCCT TTCGGCAGGG AAGATGGAAC GGGTAACCGT TCGGACGGTA CCTGCAGAAG 480
CAGCCACGGC TAACTTCGTG CCAGCAGCCG CGGTAATACG AAGGTGGCAA GCGTTGTTCG 540
GATTTACTGG GCGTACAGGG AGCGTAGGCG GTTGGGTAAG CCCTCCGTGA AATCTCCGGG 600
CCTAACCCGG AAAGTGCGGA GGGGACTGCT CGGCTAGAGG ATGGGAGAGG AGCGCGGAAT 660
TCCCGGTGTA GCGGTGAAAT GCGTAGAGAT CGGGAGGAAG GCCGGTGGCG AAGGCGGCGC 720
TCTGGAACAT TTCTGACGCT GAGGCTCGAA AGCGTGGGGA GCAAACAGGA TTAGATACCC 780
TGGTAGTCCA CGCCTTAAAC GATGGATACT AAGTGTCGGC GGGTTACCGC CGGTGCCGCA 840
GCTAACGCAT TAAGTATCCC GCCTGGGAAG TACGGCCGCA AGGTTGAAAC TCAAAGGAAT 900
TGACGGGGGC CCGCACAAGC GGTGGAGCAT GTGGTTTAAT TCGACGCAAC GCGAAGAACC 960
TTACCCAGGC TGGACATGCA GGTAGTAGAA GGGTGAAAGC CTAACGAGGT AGCAACACCA 1020
TCCTGCTCAG GTGCTGCATG GCTGTCGTCA GCTCGTGCCG TGAGGTGTTG GGTTAAGTCC 1080
CGCAACGAGC GCAACCCCTG TCTTCAGTTA CCAACGGGTC ATGCCGGGAA CTCTGGAGAG 1140
ACTGCCCAGG AGAACGGGGA GGAAGGTGGG GATGACGTCA AGTCAGCATG GCCTTTATGC 1200
CTGGGGCCAC ACACGTGCTA CAATGGCCGG TACAAAGCGC TGCAAACCCG TAAGGGGGAG 1260
CCAATCGCAA AAAACCGGCC TCAGTTCAGA TTGAGGTCTG CAACTCGACC TCATGAAGGC 1320
GGAATCGCTA GTAATCCCGG ATCAGCACGC CGGGGTGAAT ACGTNCCCGG GCCTTGTACA 1380
CACCGCCCGT CACACCACGA AAGTTTGTTG TACCTGAAGT CGTTGGCGCC AACCGCAAGG 1440
GGGCAGACGC CCACGGTATG ACCGATGATT GGGGTGAAGT CGTAACAAGG TAACCGTAAC 1500

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1420 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI- SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Nitrospira 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 4: 
CGAGAAGACG TAGCAATACG TTTGTAAAGC GGCGAACGGG TGAGGAATAC ATGGGTAACC 
TACCCTCGAG TGGGGAATAA CTAACCGAAA GGTTAGCTAA TACCGCATAC GGCTCCTGGT 
CTGCGGATCG GGAGAGAAAG CGATACCGTG GGTATCGCGC TCTTGGATGG GCTCATGTCC 
TATCAGCTTG TTGGTGAGGT AACGGCTCAC CAAGGCTTCG ACGGGTAGCT GGTCTGAGAG 
GACGATCAGC CACACTGGCA CTGCGACACG GGCCAGACTC CTACGGGAGG CAGCAGTAAG 
GAATATTGCG CAATGGGCGA CAGCCTGACG CAGCGACGCC GCGTTGGGGA TGAAAGTCTT 
CCGATTGTAA ACCCCTTTCC GCAGGGAAGA TGGAACGGGT AACCGTTCGG ACGGTACCTG 
CAGAAGCAGC CACGGCTAAC TTCGTGCCAG CAGCCGCGGT AATACGAAGG TGGCAAGCGT
TGTTCGGATT TACTGGGCGT ACAGGGAGCG TAGGCGGTTG GGTAAGCCCT CCGTGAAATC 
TCCGGGCCTA ACCCGGAAAG TGCGGAGGGG ACTGCTCGGC TAGAGGATGG GAGAGGAGCG 
CGGAATTCCC GGTGTAGCGG TGAAATGCGT AGAGATCGGG AGGAAGGCCG GTGGCGAAGG 
CGGCGCTCTG GAACATTTCT GACGCTGAGG CTCGAAAGCG TGGGGAGCAA ACAGGATTAG 
ATACCCTGGT AGTCCACGCC TTAAACGATG GATACTAAGT GTCGGCGGGT TACCGCCGGT 
GCCGCAGCTA ACGCATTAAG TATCCCGCCT GGGAAGTACG GCCGCAAGGT TGAAACTCAA 
AGGAATTGAC GGGGCCCCGC ACAAGCGGTG GAGCATGTGG TTTAATTCGA CGCAACGCGA 
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AGAACCTTAC CCAGGCAGGA CATGCAGGTA GTAGAAGGGT GAAAGCCTAA CGAGGTAGCA 
ATACCATCCT GCTCAGGTGC TGCATGGCTG TCGTCAGCTC GTGCCGTGAG GTGTTGGGTT 
AAGTCCCGCA ACGAGCGCAA CCCCTGTCTT CAGTTACCAA CGGGTCATGC CGGGAACTCT 
GGAGAGACTG CCCAGGAGAA CGGGGAGGAA GGTGGGGATG ACGTCAAGTC AGCATGGCCT 
TTATGCCTGG GGCCACACAC GTGCTACAAT GGCCGGTACA AAGCGCTGCA AACCCGTAAG 
GGGGAGCCAA TCGCAAAAAA CCGGCCTCAG TTCAGATTGA GGTCTGCAAC TCGACCTCAT 
GAAGGCGGAA TCGCTAGTAA TCCCGGATCA GCACGCCGGG GTGAATACGT NCCCGGGCCT 
TGTACACACC GCCCGTCACA CCACGAAAGT TTGTTGTACC TGAAGTCGTT GGCGCCAACC 
GCAAGGAGGC AGACGCCCAC GGTATGACCG ATGATTGGGG 
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1505 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

AGAGTTTGAT CCTGGCTCAG AACGAACGCT GGCGGCGCGC CTAATACATG CAAGTCGAGC 60
GAGAAGACGT AGCAATACGT TTGTAAAGCG GCGAACGGGT GAGGAATACA TGGGTAATCT 120
ACCATCGAGT GGGGAATAAC CAACCGAAAG GTTGGCTAAT ACCGCGTACG CTTCTGAGTC 180
TTCGGGTTCG GAAGGAAAGC CGTACTGTGA GTGCGGCGCT CTTTGATGAG CTCATGTCCT 240
ATCAGCTTGT TGGTAGGGTA ACGGCCTACC AAGGCTTTGA CGGGTAGCTG GTCTGAGAGG 300
ACGATCAGCC ACACTGGCAC TGCGACACGG GCCAGACTCC TACGGGAGGC AGCAGTAAGG 360
AATATTGCGC AATGGGCGAA AGCCTGACGC AGCNACGCCG CGTGGGGGAT GAAGGTCTTC 420
GGATTGTAAA CCCCTTTCGG GAGGGAAGAT GGAGCGAGCA ATCGTTCGGA CGGTACCTCC 480
AGAAGCAGCC ACGGCCAACT TCGTGCCAGC AGCCGCGGTA ATACGAAGGT GGCAAGCGTT 540
GTTCGGATTC ACTGGGCGTA CAGGGTGTGT AGGCGGTTTG GTAAGCCTTC TGTTAAAGCT 600
TCGGGCCCAA CCCGGAAAGC GCAGACGGTA CTGCCAGGCT AGAGGGTGGG AGAGGAGCGC 660
GGAATTCCCG GTGTAGCGGT GAAATGCGTA GAGATCGGGA GGAAGGCCGG TGGCGAAGGC 720
GGCGCTCTGG AACATACCTG ACGCTGAGAC ACGAAAGCGT GGGGAGCAAA CAGGATTAGA 780
TACCCTGGTA GTCCACGCCC TAAACTATGG ATACTAAGTG TCGGCGGGTT ACCGCCGGTG 840
CCGCAGCTAA CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA 900
GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC GCAACGCGAA 960
GAACCTTACC CAGGTTGGAC ATGCACGTAG TAGAAAGGTG AAAGCCTGAC GAGGTAGCAA 1020
TACCAGCGTG CTCAGGTGCT GCATGGCTGT CGTCAGCTCG TGCCGTGAGG TGTTGGGTTA 1080
AGTCCCGCAA CGAGCGCAAC CCCTGCTTTC AGTTGCTACC GGGTCATGCC GAGCACTCTG 1140
AAAGGACTGC CCAGGATAAC GGGGAGGAAG GTGGGGATGA CGTCAAGTCA GCATGGCCTT 1200
TATGCCTGGG GCCACACACG TGCTACAATG GCCGGTACAA AGCGCTGCAA ACCCGTGAGG 1260
GGGAGCCAAT CGCAAAAAAC CGGCCTCAGT TCAGATTGAG GTCTGCAACT CGACCTCATG 1320
AAGGCGGAAT CGCTAGTAAT CGCGGATCAG CACGCCGCGG TGAATACGTN CCCGGGCCTT 1380
GTACACACCG CCCGTCACAC CACGAAAGCC TGTTGTACCT GAAGTCGCCC AAGCCAACCG 1440
CAAGGAGGCA GGCGCCCACG GTATGGCCCG TGATTGGGGT GAAGTCGTAA CAAGGTAACC 1500
GTAAA 1505

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1441 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

AAGTCGAGCG AGAAGGTGTA GCAATACACT TGTAAAGCGG CGAACGGGTG AGGAATACAT 6 0 
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GGGTAATCTA CCATCGAGTG GGGAATAACC AGCCGAAAGG TTGGCTAATA CCGCGTACGC 12 0 

TTCCGAGTCT TCGGGCTTGG AAGGAAAGCC GCACTGTGAG TGCGGCGCTC TTTGATGAGC 180 

5 

TCATGTCCTA TCAGCTTGTT GGTAGGGTAA CGGCCTACCA AGGCTTTGAC GGGTAGCTGG 24 0 

TCTGAGAGGA CGATCAGCCA CACTGGCACT GCGACACGGG CCAGACTCCT ACGGGAGGCA 3 00 

10 GCAGTAAGGA ATATTGCGCA ATGGGCGAAA GCCTGACGCA GCGACGCCGC GTGGGGGATG 360 

AAGGTCTTCG GATTGTAAAC CCCTTTCGGG AGGGAAGATG GAGCCAGCAA TCGTTCGGAC 420 

GGTACCTCCA GAAGCAGCCA CGGCCAACTT CGTGCCAGCA GCCGCGGTAA TACGAAGGTG 4 80 

15 

GCAAGCGTTG TTCGGATTCA CTGGGCGTAC AGGGTGTGTA NGCGGTTTGG TAAGCCTTCT 54 0 

GTTAAAGCTT CGGGCCCAAC CCGGAAAGCG CAGAGGGTAC TGCCAGGCTA GAGGGTGGGA 6 00 

20 GAGGAGCGCG GAATTCCCGG TGTAGCGGTG AAATGCGTAG AGATCGGGAG GAAGGCCGGT 660 

GGCGAAGGCG GCGCTCTGGA ACATGCCTGA CGCTGAGACA CGAAAGCGTG GGGAGCAAAC 72 0 

AGGATTAGAT ACCCTGGTAG TCCACGCCCT AAACTATGGA TACTAAGTGT CGGCGGGTTA 780 

25 

CCGCCGGTGC CGCAGCTAAC GCATTAAGTA TCCCGCCTGG GAAGTACGGC CGCAAGGTTG 84 0 

AAACTCAAAG GAATTGACGG GGGCCCGCAC AAGCGGTGGA GCATGTGGTT TAATTCGACG 900 

30 CAACGCGAAG AACCTTACCC AGGTTGGACA TGCACGTAGT AGAAAGGTGA AAGNCTAACG 960 

AGGTAGCAAT AC CAGCGTGC TCAGGTGCTG CATGGCTGTC GTCAGCTCGT GCCGTGAGGT 102 0 

GTTGGGTTAA GTCCCGCAAC GAGCGCAACC CCTGCTTTCA GTTGCTACCG GGTCATGCCG 10 8 0 

35 

AGCACTCTGA AAGGACTGCC CAGGATAACG GGGAGGAAGG TGGGGATGAC GTCAAGTCAG 114 0 

CATGGCCTTT ATGCCTGGGG CCACACACGT GCTACAATGG CCGGTACAAA GCGCTGCAAA 12 0 0 

40 CCCGTGAGGG GGAGC CAATC GCAAAAAACC GGCCTCAGTT CAGATTGAGG TCTGCAACTC 12 6 0 

GACCTCATGA AGGCGGAATC GCTAGTAATC GCGGATCAGC ACGCCGCGGT GAATACGTNC 132 0 

CCGGGCCTTG TACACACCGC CCGTCACACC ACGAAAGCCT GTTGTACCTG AAGTCGCCCA 1380 

45 

AGCCAACCGC AAGGAGGCAG GCGCCCACGG TATGGCCGGT GATTGGGGTG AAGTCCTAAC 144 0 

A 1441 
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1426 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

TAATACATGC AAGTCGAGCG AGAAGGTGTA GCAATACACT TGTAAAGCGG CGAACGGGTG 60
AGGAATACAT GGGTAATCTA CCATCGAGTG GGGAATAACC AACCGAAAGG TTGGCTAATA 120
CCGCGTACGC TTCTGAGCCT TCGTGTTCGG AAGGAAAGCC GTACTGTGAG TGCGGGGCTC 180
TTTGATGAGC TCATGTCCTA TCAGCTTGTT GGTAGGGTAA CGGCCTACCA [illegible] 240
GGGTAGCTGG TCTGAGAGGA CGATCAGCCA CACTGGCACT GCGACACGGG [illegible] 300
ACGGGAGGCA GCAGTAAGGA ATATTGCGCA ATGGGCGAAA GCCTGACGCA [illegible] 360
GTGGGGGATG AAGGTCTTCG GATTGTAAAC CCCTTTCGGG AGGGAAGATG [illegible] 420
TCGTTCGGAC GGTACCTCCA GAAGCAGCCA CGGCCAACTT [illegible] [illegible] 480
TACGAAGGTG GCAAGCGTTG CTTGGATTCA CTGGGCGTAC AGGGTGTGTA [illegible] 540
TAAGCCTTCT GTTAAAGCTT CGGGCCCAAC CCGAAAAGCG [illegible] [illegible] 600
GAGGGTGGGA GAGGAGCGCG GAATTCCCGG TGTAGCGGTG [illegible] [illegible] 660
[illegible] GGCGAAGGCG GCGCTCTGGA ACATACCTGA CGCTGAGACA CGAAAACGTG 720
GGGAGCAAAC AGGATTAGAT ACCCTGGTAG TCCACGCCCT AAACTATGGA TACTAAGTGT 780
CGGCGGGTTA CCGCCGGTGC CGCAGCTAAC GCATTAAGTA TCCCGCCTGG GAGGTACGGC 840
CGCAAGGTTG AAACTCAAAG GAATTGACGG GGGCCCGCAC AAGCGGTGGA GCTTGTGGTT 900
TAATTCGACG CAACGCGAAG AACCTTACCC AGGTTGGACA TGCACGTAGT AGAAAGGTGA 960
AAGCCTGACG AGGTAGCAAT ACCAGCGTGC TCAGGTGCTG CATGGCTGTC GTCAGCTCGT 1020
GCCGTGAGGT GTTGGGTTAA GTCCCGCAAC GAGCGCAACC CCTGCTTTCA GTTGCTACCG 1080
GGTCATGCCG AGCACTCTGA AAGGACTGCC CAGGATAACG GGGAGGAAGG TGGGGATGAC 1140
GTCAAGTCAG CATGGCCTTT ATGCCTGGGG CCACACACGT GCTACAATGG CCGGTACAAA 1200
GCGCTGCAAA CCCGTGAGGG GGAGCCAATC GCAAAAAACC GGCCTCAGTT CAGATTGAGG 1260
TCTGCAACTC GACCTCATGA AGGCGGAATC GCTAGTAATC GCGGATCAGC ACGCCGCGGT 1320
GAATACGTNC CCGGGCCTTG TACACACCGC CCGTCACACC ACGAAAGCCT GTTGTACCTG 1380
AAGTCGCCCA AGCCAACCGC AAGGAGGCAG GCGCCCACGG TATGGC 1426

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1429 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

TAATACATGC AAGTCGAGCG AGAAGGTGTA GCAATACACT TGTAAAGCGG CGAACGGGTG 60 

AGGAATACAT GGGTAATCTA CCATCGAGTG GGGAATAACC AACCGAAAGG TTGGC TAAT A 12 0 

30 CCGCGTACGC CTCCGAGTCT TCGGGTTCGG AGGGAAAGCT GCACTGTGAG TGTAGCGCTC 18 0 

TTTGATGAGC TCATGTCCTA TCAGCTTGTT GGTAGGGTAA CGGCCTACCA AGGCTTTGAC 24 0 

GGGTAGCTGG TCTGAGAGGA CGATCAGCCA CACTGGCACT GCGACACGGG CCAGACTCCT 3 00 

35 

ACGGGAGGCA GCAGTAAGGA ATATTGCG CA ATGGGCGAAA GCCTGACGCA GCNACGCCGC 3 60 

GTGGGGGATG AAGGTCTTCG GATTGTAAAC CCCTTTCGGG AGGGAAGATG GAGCGAGCAA 42 0 

40 TCGTTCGGAC GGTACCTCCA GAAGCAGCCA CGGCCAACTT CGTGCCAGCA GCCGCGGTAA 480 

TACGAAGGTG GCAAGCGTTG TTCGGATTCA CTGGGCGTAC AGGGTGTGTA GGCGGTTTGG 54 0 

TAAGCCTTCT GTTAAAGCTT CGGGCCCAAC CCGGAAAGCG C AGGGGG T AC TGCCAGGCTA 60 0 

45 

GAGGGTGGGA GAGGAGCGCG GAATTCCCGG TGTAGCGGTG AAATGCGTAG AGATCGGGAG 66 0 

GAAGGCCGGT GGCGAAGGCG GCGCTCTGGA ACATACCTGA CGCTGAGACA CGAAAGCGTG 72 0 

50 GGGAGCAAAC AGGATTAGAT ACCCTGGTAG TCCACGCCCT AAGCTATGGA TACTAAGTGT 780 

CGGCGGGTTA CCGCCGGTGC CGCAGCCAAC GCGTTAAGTA TCCCGCCTGG GAAGTACGGC 84 0 

CGCAAGGTTG AAACTCAAAG GAATTGACGG GGGCCCGCAC AAGCGGTGGA GCATGTGGTT 90 0 

55 

TAATTCGACG CAACGCGAAG AACCTTACCC AGGTTGGACA TGCACGTAGT AGAAAGGTGA 96 0. 
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AAGCCTGACG AGGTAGCAAT ACCAGCGTGC TCAGGTGCTG CATGGCTGTC GTCAGCTCGT 102 0 

GCCGTGAGGT GTTGGGTTAA GTCCCGCAAC GAGCGCAACC CCTGCTTTCA GTTGCTACCG 108 0 

GGTCATGCCG AGCACTCTGA AAGGACTGCC CAGGATAACG GGGGAGGAAG GTGGGGATGA 114 0 

CGTCAAGTCA GCATGGCCTT TATGCCTGGG GCCACACACG TGCTACAATG GCCGGTACAA 12 0 0 

AACGCTGCAA ACCCGTGAGG GGGAGCCAAT CGCAAAAAAC CGGCCTCAGT TCAGATTGAG 12 60 

GTCTGCAACT CGACCT CATG AAGGCGGAAT CGCTAGTAAT CGCGGATCAG CACGCCGCGG 132 0 

TGAATACGTN CCCGGGCCTT GTGCACACCG CCCGTCACAC CACGAAAGCC TGTTGTACCT 13 80 

GAAGTCGCCC AAGCCAACCG CAAGGAGGCA GGCGCCCACG GTATGGCCG 1429

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1415 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
CGAGAAGGTG TAGCAATACA CTTGTAAAGC GGCGAACGGG TGAGGAATAC ATGGGTAATC 60
TACCATCGAG TGGGGAATAA CCAACCGAAA GGTTGGCTAA TACCGCGTAC GCCTCCGAGT 120
CTTCGGGTTC GGAGGGAAAG CTGCACTGTG AGTGTAGCGC TCTTTGATGA GCTCATGTCC 180
TATCAGCTTG TTGGTAGGGT AACGGCCTAC CAAGGCTTTG ACGGGTAGCT GGTCTGAGAG 240
GACGATCAGC CACACTGGCA CTGCGACACG GGCCAGACTC CTACGGGAGG CAGCAGTAAG 300
GAATATTGCG CAATGGGCGA AAGCCTGACG CAGCNACGCC GCGTGGGGGA TGAAGGTCTT 360
CGGATTGTAA ACCCCTTTCG GGAGGGAAGA TGGAGCGAGC AATCGTTCGG ACGGTACCTC 420
CAGAAGCAGC CACGGCCAAC TTCGTGCCAG CAGCCGCGGT AATACGAAGG TGGCAAGCGT 480
TGTTCGGATT CACTGGGCGT ACAGGGTGTG TAGGCGGTTT GGTAAGCCTT CTGTTAAAGC 540
TTCGGGCCCA ACCCGGAAAG CGCAGAGGGT ACTGCCAGGC TAGAGGGTGG GAGAGGAGCG 600
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CGGAATTCCC GGTGTAGCGG TGAAATGCGT AGAGATCGGG AGGAAGGCCG GTGGCGAAGG 660 

CGGCGCTCTG GAACATACCT GACGCTGAGA CACGAAAGCG TGGGGAGCAA ACAGGATTAG 72 0 

ATACCCTGGT AGTCCACGCC CTAAACTATG GATACTAAGT GTCGGCGGGT TACCGCCGGT 780 

GCCGCAGCTA ACGCATTAAG TATCCCGCCT GGGAAGTACG GCCGCAAGGT TGAAACTCAA 84 0 

AGGAATTGAC GGGGGCCCGC ACAAGCGGTG GAGCATGTGG TTTAATTCGA CGCAACGCGA 90 0 

AGAACCTTAC CCAGGTTGGA CATGCACGTA GTAGAAAGGT GAAAGCCTGA CGAGGTAG C A 96 0 

ATACCAGCGT GCTCAGGTGC TGCATGGCTG TCGTCAGCTC GTGCCGTGAG GTGTTGGGTT 102 0 

15 AAGTC CCGCA ACGAGCGCAA CCCCTGCTTT CAGTTGCTAC CGGGTCATGC CGAGCACTCT 108 0 

GAAAGGACTG CCCAGGATAA CGGGGAGGAA GGTGGGGATG ACGTCAAGT C AGCATGGCCT 114 0 

TTATG C CTGG GGCCACACAC GTGCTACAAT GGCCGGTATA AAACGCTGCA AACCCGTGAG 12 0 0 

20 

GGGGAGCCAA TCGCAAAAAA CCGGCCTCAG TTCAGATTGA GGTCTGCAAC TCGACCTCAT 12 6 0 

GAAGGCGGAA TCGCTAGTAA TCGCGGATCA GCACGCCGCG GTGAATACGT NCCCGGGCCT 132 0 

25 TGTACACACC GCCCGTCACA CCACGAAAGC CTGTTGTACC TGAAGTCGCC CAAGCCAACC 13 8 0 

GCAAGGAGGC AGGCGCCCAC GGTATGGCCG GTGAT 1415 
(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1435 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

CCTAATACAT GCAAGTCGAT CGAGAAGGTG TAGCAATACA CTTGTAAAGC GGCGAACGGG 60
TGAGGAATAC ATGGGTAATC TACCATCGAG TGGGGAATAA CCAACCGAAA GGTTGGCTAA 120
TACCGCGTAC GCCTCCGAGT CTTCGGGTTC GGAGGGAAAG CTGCACTGTG AGTGTAGCGC 180
TCTTTGATGA GCTCATGTCC TATCAGCTTG TTGGTAGGGT AACGGCCTAC CAAGGCTTTG 240
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34 

ACGGGTAGCT GGTCTGAGAG GACGATCAGC CACACTGGCA CTGCGACACG GGCCAGACTC 300
CTACGGGAGG CAGCAGTAAG GAATATTGCG CAATGGGCGA AAGCCTGACG CAGCCACGCC 360
GCGTGGGGGA TGAAGGTCTT CGGATTGTAA ACCCCTTTCG GGAGGGAAGA TGGAGCGAGC 420
AATCGTTCGG ACGGTACCTC CAGAAGCAGC CACGGCCAAC TTCGTGCCAG CAGCCGCGGT 480
AATACGAAGG TGGCAAGCGT TGTTCGGATT CACTGGGCGT ACAGGGTGTG TAGGCGGTTT 540
GGTAAGCCTT CTGTTAAAGC TTCGGGCCCA ACCCGGAAAG CGCAGAGGGT ACTGCCAGGC 600
TAGAGGGTGG GAGAGGAGCG CGGAATTCCC GGTGTAGCGG TGAAATGCGT AGAGATCGGG 660
AGGAAGGCCG GTGGCGAAGG CGGCGCTCTG GAACATACCT GACGCTGAGA CACGAAAGCG 720
TGGGGAGCAA ACAGGATTAG ATACCCTGGT AGTCCACGCC CTAAACTATG GATACTAAGT 780
GTCGGCGGGT TACCGCCGGT GCCGCAGCTA ACGCATTAAG TATCCCGCCT GGGAAGTACG 840
GCCGCAAGGT TGAAACTCAA AGGAATTGAC GGGGGCCCGC ACAAGCGGTG GAGCATGTGG 900
TTTAATTCGA CGCAACGCGA AGAACCTTAC CCAGGTTGGA CATGCACGTA GTAGAAAGGT 960
GAAAGCCTGA CGAGGTAGCA ATACCAGCGT GCTCAGGTGC TGCATGGCTG TCGTCAGCTC 1020
GTGCCGTGAG GTGTTGGGTT AAGTCCCGCA ACGAGCGCAA CCCCTGCTTT CAGTTGCTAC 1080
CGGGTCATGC CGAGCACTCT GAAAGGACTG CCCAGGATAA CGGGGAAGGA AGGTGGGGAT 1140
GACGTCAAGT CAGCATGGCC TTTATGCCTG GGGCCACACA CGTGCTACAA TGGCCGGTAC 1200
AAAACGCTGC AAACCCGTGA GGGGGAGCCA ATCGCAAAAA ACCGGCCTCA GTTCAGATTG 1260
AGGTCTGCAA CTCGACCTCA TGAAGGCGGA ATCGCTAGTA ATCGCGGATC AGCACGCCGC 1320
GGTGAATACG TNCCCGGGCC TTGTACACAC CGCCCGTCAC ACCACGAAAG CCTGTTGTAC 1380
CTGAAGTCGC CCAAGCCAAC CGCAAGAAGG CAGGCGCCCA CGGTATGGCC GGTGA 1435
(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1437 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

AATACATGCA AGTCGATCGA GAAGGTGTAG CAATACACTT GTAAAGCGGC GAACGGGTGA 60
GGAATACATG GGTAATCTAC CATCGAGTGG GGAATAACCA ACCGAAAGGT TGGCTAATAC 120
CGCGTACGCC TCCGAGTCTT CGGGTTCGGA GGGAAAGCTG CACTGTGAGT GTAGCGCTCT 180
TTGATGAGCT CATGTCCTAT CAGCTTGTTG GTAGGGTAAC GGCCTACCAA GGCTTTGACG 240
GGTAGCTGGT CTGAGAGGAC GATCAGCCAC ACTGGCACTG CGACACGGGC CAGACTCCTA 300
CGGGAGGCAG CAGTAAGGAA TATTGCGCAA TGGGCGAAAG CCTGACGCAG CCACGCCGCG 360
TGGGGGATGA AGGTCTTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG AGCGAGCAAT 420
CGTTCGGACG GTACCTCCAG AAGCAGCCAC GGCCAACTTC GTGCCAGCAG CCGCGGTAAT 480
ACGAAGGTGG CAAGCGTTGT TCGGATTCAC TGGGCGTACA GGGTGTGTAG GCGGTTTGGT 540
AAGCCTTCTG TTAAAGCTTC GGGCCCAACC CGGAAAGCGC AGAGGGTACT GCCAGGCTAG 600
AGGGTGGGAG AGGAGCGCGG AATTCCCGGT GTAGCGGTGA AATGCGTAGA GATCGGGAGG 660
AAGGCCGGTG GCGAAGGCGG CGCTCTGGAA CATACCTGAC GCTGAGACAC GAAAGCGTGG 720
GGAGCAAACA GGATTAGATA CCCTGGTAGT CCACGCCCTA AACTATGGAT ACTAAGTGTC 780
GGCGGGTTAC CGCCGGTGCC GCAGCTAACG CATTAAGTAT CCCGCCTGGG AAGTACGGCC 840
GCAAGGTTGA AACTCAAAGG AATTGACGGG GGCCCGCACA AGCGGTGGAG CATGTGGTTT 900
AATTCGACGC AACGCGAAGA ACCTTACCCA GGTTGGACAT GCACGTAGTA NAAAGGTGAA 960
AGCCTGACGA GGTAGCAATA CCAGCGTGCT CAGGTGCTGC ATGGCTGTCT TCAGCTCGTG 1020
CCGTGAGGTG TTGGGTTAAG TCCCGCAACG AGCGCAACCC CTGCTTTCAG TTGCTACCGG 1080
GTCATGCCGA ACACTCTGAA AGGACTGCCC AGGATAACGG GGAAGGAAGG TGGGGATGAC 1140
GTCAAGTCAG CATGGCCTTT ATGCCTGGGG CCACACACGT GCTACAATGG CCGGTACAAA 1200
GCGCTGCAAA CCCGTGAGGG GGAGCCAATC GCAAAAAACC GGCCTCAGTT CAGATTGAGG 1260
TCTGCAACTC GACCTCATGA AGGCGGAATC GCTAGTAATC GCGGATCAGC ACGCCGCGGT 1320
GAATACGTNC CCGGGCCTTG TACACACCGC CCGTCACACC ACGAAAGCCT GTTGTACCTG 1380
AAGTCGCCCA AGCCAACCGC AAGGAGGCAG GCGCCCACGG TATGGCCGGT GATGGGG 1437
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1437 base pairs 

(B) TYPE: nucleic acid 
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(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:





AATACATGCA AGTCGATCGA NAAGGTGTAG CAATACACTT GTAAAGCGGC GAACGGGTGA 60
GGAATACATG GGTAATCTAC CATCGAGTGG GGAATAACCA ACCGAAAGGT TGGCTAATAC 120
CGCGTACGCC TCCGAGTCTT CGGGTTCGGA GGGAAAGCTG CACTGTGAGT GTAGCGCTCT 180
TTGATGAGCT CATGTCCTAT CAGCTTGTTG GTAGGGTAAC GGCCTACCAA GGCTTTGACG 240
GGTATCTGGT CTGAGAGGAC GATCAGCCAC ACTGGCACTG CGACACGGGC CAGACTCCTA 300
CGGGAGGCAG CAGTAAGGAA TATTGCGCAA TGGGCGAAAC CCNGACGCAG CCACGCCGCG 360
TGGGGGATGA AGGTCTTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG AACGAGCAAT 420
CGTTCGGACG GTACCTCCAG AAGCAGCCAC GGCCAACTTC GTGCCAGCAG CCGCGGTAAT 480
ACGAAGGTGG CAAGCGTTGT TCGGATTCAC TGGGCGTACA GGGTGTGTAG GCGGTTTGGT 540
AAGCCTTCTG TTAAAGCTTC GGGCCCAACC CGGAAAGCGC AGAGGGTACT GCCAGGCTAG 600
AGGGTGGGAG AGGAGCGCGG AATTCCCGGT GTAGCGGTGA AATGCGTAGA GATCGGGAGG 660
AAGGCCGGTG GCGAAGGCGG CGCTCTGGAA CATACCTGAC GCTGAGACAC GAAAGCGTGG 720
GGNGCAAACA GGATTAGATA CCCTGGTAGT CCACGCCCTA AACTATGGAT ACTAAGTGTC 780
GGCGGGTTAC CGCCGGTGCC GCAGCTAACG CATTAAGTAT CCCGCCTGGG AAGTACGGCC 840
GCAAGGTTGA AACTCAAAGG GATTGACGGG GGCCCGCACA AGCGGTGGGG CATGTGGTTT 900
AATTCGACGC AACGCGAAGA ACCTTACCCA GGTTGGACAT GCACGTAGTN GAAAGGTGAA 960
AGCCTGACGA GGTAGCAATA CCAGCGTGCT CAGGTGCTGC ATGGCTGTCG TCAGCTCGTG 1020
CCGTGAGGTG TTGGGTTAAG TCCCGCAACG AGCGCAACCC CTGCTTTCAG TTGCTACCGG 1080
GTCATGCCGA ACACTCTGAA AGGACTGCCC AGGATAACGG GGAAGGAAGG TGGGGATGAC 1140
GTCAAGTCAG CATGGCCTTT ATACCTGGGG CCACACACGT GCTACAATGG CCGGTACAAA 1200
ACGCTGCAAA CCCGTGAGGG GGAGCCAATC GCAAAAAACC GGCCTCAGTT CAGATTGAGG 1260




TCTGCAACTC GACCTCATGA ATGCGGAATC GCTAGTAATC GCGGATCAGC ACGCCGCGGT 1320
GAATACGTNC CCGGGCCTTG TACACACCGC CCGTCACACC ACGAAAGCCT GTTGTACCTG 1380
AAGTCGCCCA AGCCAACCGC AAGGAGGCAG GCGCCCACGG TATGGCCGGT GATGGGG 1437

(2) INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1435 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

TAATACATGC AAGTCGATCG ANAAGGTGTA GCAATACACT TGTAAAGCGG CGAACGGGTG 60
AGGAATACAT GGGTAATCTA CCATCGAGTG GGGAATAACC AACCGAAAGG TTGGCTAATA 120
CCGCGTACGC TTCCGAGTCT TCGGGCTTGG AAGGAAAGCC GCACTGTGAG TGCGGCGCTC 180
TTTGATGAGC TCATATCCTA TCANCTTGTT GGTAGGGTAA CGGCCTACCA AGGCTTTGAC 240
GGGTATCTGG TCTGAGAGGA CGATCAGCCA CACTGGCACT GCGACACGGG CCAGACTCCT 300
ACGGGAGGCA GCAGTAAGGA ATATTGCGCA ATGGGCGAAA CCCNGACGCA GCCACGCCGC 360
GTGGGGGATG AAGGTCTTCG GATTGTAAAC CCCTTTCGGG AGGGAAGATG GAACGAGCAA 420
TCGTTCGGAC GGTACCTCCA GAAGCAGCCA CGGCCAACTT CGTGCCAGCA GCCGCGGTAA 480
TACGAAGGTG GCAAGCGTTG TTCGGATTCA CTGGGCGTAC AGGGTGTGTA GGCGGTTTGG 540
TAAGCCTTCT GTTAAAGCTT CGGGCCCAAC CCGGAAAGCG CAGAGGGTAC TGCCAGGCTA 600
GAGGGTGGGA GAGGAGCGCG GAATTCCCGG TGTAGCGGTG AAATGCGTAG AGATCGGGAG 660
GAAGGCCGGT GGCGAAGGCG GCGCTCTGGA ACATACCTGA CGCTCAGACA CGAAAGCGTG 720
GGGAGCAAAC AGGATTAGAT ACCCTGGTAG TCCACGCCCT AAACTATGGA TACTAAGTGT 780
CGGCGGGTTA CCGCCGGTGC CGCAGCTAAC GCATTAAGTA TCCCGCCTGG GAAGTACGGC 840
CGCAAGGTTG AAACTCAAAG GAATTGACGG GGGCCCGCAC AAGCGGTGGA GCATGTGGTT 900
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TAATTCGACG CAACGCGAAG AACCTTACCC AGGTTGGACA TGCACGTAGT AGAAAGGTGA 96 0 

AAGCCTGACG AGGTAGCAAT ACCAGCGTGC TCAGGTGCTG CATGGCTGTC GTCAGCTCGT 1020
GCCGTGAGGT GTTGGGTTAA GTCCCGCAAC GAGCGCAACC CCTGCTTTCA GTTGCTGCCG 1080
GGTCATGCCG AACACTCTGA AAGGACTGCC CAGGATAACG GGGAAGGAAG GTGGGGATGA 1140
CGTCAAGTCA GCATGGCCTT TATGCCTGGG GCCACACACG TGCTACAATG GCCGGTACAA 1200
AACGCTGCAA ACCCGTGAGG GGGAGCCAAT CGCAAAAAAC CGGCCTCAGT TCANATTGAG 1260
GTCTGCAACT CGACCTCATG AATGCGGAAT CGCTAGTAAT CGCGGATCAG CACGCCGCGG 1320
TGAATACGTN CCCGGGCCTT GTACACGCCG CCCGTCACAC CACGAAAGCC TGTTGTACCT 1380
GAAGTCGCCC AAGCCAACCG CAAGGAGGCA NGCGCCCACG GTATGGCCGG TGATG 1435
(2) INFORMATION FOR SEQ ID NO: 14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "oligonucleotide primer"
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
CGGGAGGGAA GATGGAGC

(2) INFORMATION FOR SEQ ID NO: 15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Oligonucleotide primer"
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



18 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

CCAACCCGGA AAGCGCAGAG

(2) INFORMATION FOR SEQ ID NO: 16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Oligonucleotide primer"
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
AGCCTGGCAG TACCCTCT

(2) INFORMATION FOR SEQ ID NO: 17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrococcus mobilis
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
CAGCCGGGAG GAAAAGCA

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
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(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Magnetobacterium bavaricum
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
TGTAGGGAAA GATGATGA

(2) INFORMATION FOR SEQ ID NO: 19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrobacter hamburgensis
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
TGTGCGGGAA GATAATGA

(2) INFORMATION FOR SEQ ID NO: 20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospina gracilis
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 20: 

CGGGTGGGAA GAACAAAA 18

(2) INFORMATION FOR SEQ ID NO: 21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira marina
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
CATGAGGAAA GATAAAGT 18
(2) INFORMATION FOR SEQ ID NO: 22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
CGGCAGGGAA GATGGAAC 18
(2) INFORMATION FOR SEQ ID NO: 23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
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(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
CGGGAGGGAA GATGGAGC 18
(2) INFORMATION FOR SEQ ID NO: 24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
CCGCAGGGAA GATGGAAC 18

(2) INFORMATION FOR SEQ ID NO: 25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

CGGGAGGGAA GATGGAAC 18
(2) INFORMATION FOR SEQ ID NO: 26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrobacter
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
CGTGCGGGAA GATAATGA 18
(2) INFORMATION FOR SEQ ID NO: 27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
CGGCAGGGAA GATGGAAC 18
(2) INFORMATION FOR SEQ ID NO: 28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
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40 



(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira moscoviensis
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
CGGGAGGGAA GATGGACG 18
(2) INFORMATION FOR SEQ ID NO: 29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrococcus mobilis
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
TCAACCTGGG AATTGCATCC 20

(2) INFORMATION FOR SEQ ID NO: 30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
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20 



(A) ORGANISM: Magnetobacterium bavaricum
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
TCAACCCGGG AATTGCCTTG
(2) INFORMATION FOR SEQ ID NO: 31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrobacter hamburgensis
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
TCAACTCCAG AACTGCCTTT

(2) INFORMATION FOR SEQ ID NO: 32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospina gracilis
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
TCAACCGTGG AATTGCGTTT
(2) INFORMATION FOR SEQ ID NO: 33:
(i) SEQUENCE CHARACTERISTICS:
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(A) LENGTH: 2 0 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospina marina
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
TTAACCGGGA AAGGTCGAGA 20
(2) INFORMATION FOR SEQ ID NO: 34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
CTAACCCGGA AAGTGCGGAG 20
(2) INFORMATION FOR SEQ ID NO: 35:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
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30 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Nitrospira 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
CCAACCCGAA AAGCGCAGAG 20

(2) INFORMATION FOR SEQ ID NO: 36:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
CCAACCCGGA AAGCGCAGAG 20

(2) INFORMATION FOR SEQ ID NO: 37:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrobacter
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
TCAACTCCAG AACTGCCTTT 20

(2) INFORMATION FOR SEQ ID NO: 38: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Nitrospira moscoviensis 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
CCAACCCGGA AAGCGCAGAG

(2) INFORMATION FOR SEQ ID NO: 39:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrococcus mobilis
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
AGCCAAACAG TATCGGAT
(2) INFORMATION FOR SEQ ID NO: 40:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
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(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Magnetobacterium bavaricum 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 40: 
AGTTAAACAG TTTTCAAG 

(2) INFORMATION FOR SEQ ID NO: 41:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrobacter hamburgensis
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
AGACCTTCAG TATCAAAG

(2) INFORMATION FOR SEQ ID NO: 42:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospina gracilis
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:
AGCCGAATAG TTTCAAAC

(2) INFORMATION FOR SEQ ID NO: 43: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospina marina
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:
AGCTGAATAG TTCCTCTC 18
(2) INFORMATION FOR SEQ ID NO: 44:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:
AGCCGAGCAG TCCCCTCC 18
(2) INFORMATION FOR SEQ ID NO: 45:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
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(iv) ANTI -SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Nitrospira 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:
AGCCTGGCAG TACCCTCT 18

(2) INFORMATION FOR SEQ ID NO: 46:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
AGCCTGGCAG TACCCCCT 18
(2) INFORMATION FOR SEQ ID NO: 47:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
AGCCTGGCAG TACCGTCT 18
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(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrobacter
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:
AGATCCTCAG TATCAAAG

(2) INFORMATION FOR SEQ ID NO: 49:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira moscoviensis
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:
AGCCTGGCAG TACCCTCT

(2) INFORMATION FOR SEQ ID NO: 50: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Oligonucleotide primer" 
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(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:
CCTGTGCTCC ATGCTCCG

(2) INFORMATION FOR SEQ ID NO: 51:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrobacter hamburgensis
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:
CCTGTGCTCC ATGCTCCG
(2) INFORMATION FOR SEQ ID NO: 52:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospina gracilis
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:
CCTGTGCAAG GGCCCCGA
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(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrococcus mobilis
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:
CCTGTCATCC GGTTCCCG

(2) INFORMATION FOR SEQ ID NO: 54:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira moscoviensis
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:
CCTGAGCACG CTGGTATT

(2) INFORMATION FOR SEQ ID NO: 55:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
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(iv) ANTI- SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Nitrospina marina 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 55: 
CCTGAGCTCG CTCCCCTT 

(2) INFORMATION FOR SEQ ID NO: 56:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Magnetobacterium bavaricum
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:
CCTGTGCAAG CTCTCCCT

(2) INFORMATION FOR SEQ ID NO: 57:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:
CCTGAGCAGG ATGGTATT
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(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:
CCTGAGCACG CTGGTATT 18
(2) INFORMATION FOR SEQ ID NO: 59:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Nitrospira
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
CCTGAGCAGG ATGGTGTT 18
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CLAIMS 

1. A consortium of microorganisms capable of nitrite oxidation in wastewater, which consortium
is enriched in members of the Nitrospira phylum.

2. An oligonucleotide primer for PCR amplification of Nitrospira DNA, said primer comprising
at least 12 nucleotides having a sequence selected from the group consisting of:

(i) any one of SEQ ID NO: 1 to SEQ ID NO: 13; and

(ii) a DNA sequence having at least 92% identity with any one of SEQ ID NO: 1 to SEQ
ID NO: 13.
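
The claims do not state how percent identity is to be computed. As a minimal sketch, assuming two sequences that are already aligned to equal length and counting only exact position-by-position matches of unambiguous bases (so an `N` never counts as a match), the 92% threshold of claim 2(ii) could be checked as follows. The function names are hypothetical, and the 18-mers of SEQ ID NO: 14, 22 and 25 serve only as short illustrative inputs; the claim itself refers to SEQ ID NO: 1 to 13.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions with identical unambiguous bases.

    Assumption: the two sequences are pre-aligned and of equal length;
    no gap handling is attempted here.
    """
    a, b = a.upper(), b.upper()
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    # Ambiguous bases such as N are treated as mismatches (an assumption).
    matches = sum(1 for x, y in zip(a, b) if x == y and x in "ACGT")
    return 100.0 * matches / len(a)


def meets_identity_threshold(a: str, b: str, threshold: float = 92.0) -> bool:
    """True if the two sequences share at least `threshold` percent identity."""
    return percent_identity(a, b) >= threshold


SEQ_14 = "CGGGAGGGAAGATGGAGC"  # SEQ ID NO: 14
SEQ_22 = "CGGCAGGGAAGATGGAAC"  # SEQ ID NO: 22
SEQ_25 = "CGGGAGGGAAGATGGAAC"  # SEQ ID NO: 25

if __name__ == "__main__":
    for name, seq in (("SEQ ID NO: 22", SEQ_22), ("SEQ ID NO: 25", SEQ_25)):
        print(name, round(percent_identity(SEQ_14, seq), 1),
              meets_identity_threshold(SEQ_14, seq))
```

A real comparison against the full-length 16S rDNA sequences would need a proper alignment step first; this sketch deliberately leaves that out.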

3. The oligonucleotide primer of claim 2, wherein said primer has a length of 12 to 50
nucleotides.

4. The oligonucleotide primer of claim 2, wherein said primer has a length of 12 to 22
nucleotides.

5. The oligonucleotide primer of claim 2, wherein said primer sequence is selected from the
group consisting of SEQ ID NO: 14, SEQ ID NO: 15 and SEQ ID NO: 16.

6. A primer pair for PCR amplification of Nitrospira DNA, said primer pair comprising:

(a) a first oligonucleotide of at least 12 nucleotides having a sequence selected from one
strand of a bacterial 16S rDNA gene; and

(b) a second oligonucleotide of at least 12 nucleotides having a sequence selected from the
other strand of said 16S rDNA gene downstream of said first oligonucleotide sequence; wherein at
least one of said first and second oligonucleotides is selected from the group consisting of:

(i) any one of SEQ ID NO: 1 to SEQ ID NO: 13; and

(ii) a DNA sequence having at least 92% identity with any one of SEQ ID NO: 1 to SEQ
ID NO: 13.

7. The primer pair of claim 6, wherein said first and second oligonucleotide primers
independently have lengths of 12 to 50 nucleotides.

8. The primer pair of claim 6, wherein said first and second oligonucleotide primers
independently have lengths of 12 to 22 nucleotides.

9. The primer pair of claim 6, wherein said first oligonucleotide primer sequence is selected
from the group consisting of SEQ ID NO: 14 and SEQ ID NO: 15, and said second oligonucleotide
primer sequence is SEQ ID NO: 16.
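
To illustrate how a primer pair of this kind brackets a product, the following sketch locates SEQ ID NO: 14 on one strand and the reverse complement of SEQ ID NO: 16 further downstream on the same strand, then reports the span between them. It is only an in-silico illustration under stated assumptions: exact matching with no mismatches, a template fragment copied from positions 361 to 620 of SEQ ID NO: 10 as printed in the sequence listing, and hypothetical function names.

```python
# Complement table for unambiguous bases plus N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def amplicon_span(template: str, forward: str, reverse: str):
    """Return the 0-based half-open (start, end) span of the predicted product.

    The forward primer is searched on the given strand; the reverse primer
    anneals to that strand, so its reverse complement is searched downstream.
    Returns None when either site is absent (exact matches only).
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start < 0:
        return None
    site = reverse_complement(reverse)
    pos = template.find(site, start)
    if pos < 0:
        return None
    return start, pos + len(site)


# Fragment of SEQ ID NO: 10 (positions 361-620), whitespace removed on load.
TEMPLATE = "".join("""
GCGTGGGGGA TGAAGGTCTT CGGATTGTAA ACCCCTTTCG GGAGGGAAGA TGGAGCGAGC
AATCGTTCGG ACGGTACCTC CAGAAGCAGC CACGGCCAAC TTCGTGCCAG CAGCCGCGGT
AATACGAAGG TGGCAAGCGT TGTTCGGATT CACTGGGCGT ACAGGGTGTG TAGGCGGTTT
GGTAAGCCTT CTGTTAAAGC TTCGGGCCCA ACCCGGAAAG CGCAGAGGGT ACTGCCAGGC
TAGAGGGTGG GAGAGGAGCG
""".split())

FORWARD = "CGGGAGGGAAGATGGAGC"  # SEQ ID NO: 14
REVERSE = "AGCCTGGCAGTACCCTCT"  # SEQ ID NO: 16

if __name__ == "__main__":
    span = amplicon_span(TEMPLATE, FORWARD, REVERSE)
    if span is not None:
        print("predicted product length:", span[1] - span[0])
```

Exact string matching is a simplification; real primer binding tolerates some mismatches and depends on reaction conditions, which this sketch does not model.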

10. A probe for detecting Nitrospira DNA, said probe comprising at least 12 nucleotides having a
sequence selected from the group consisting of:

(i) any one of SEQ ID NO: 1 to SEQ ID NO: 13; and

(ii) a DNA sequence having at least 92% identity with any one of SEQ ID NO: 1 to SEQ
ID NO: 13.
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11. The probe of claim 10, wherein said probe has a length of 15 to 50 nucleotides.

12. The probe of claim 10, wherein said probe has a length of 15 to 22 nucleotides. 

13. A kit comprising: 

at least one primer according to claim 2;

at least one primer pair according to claim 6; or

at least one probe according to claim 10.

14. The kit of claim 13, wherein said kit further includes reagents selected from the group 
consisting of buffers, salts, detergents, nucleotides and thermostable polymerase. 

15. A method of detecting a Nitrospira species in a sample, said method comprising the steps of:

(a) lysing cells in said sample to release genomic DNA;

(b) contacting denatured genomic DNA from step (a) with a primer pair according to
claim 6;

(c) amplifying Nitrospira DNA by cyclically reacting said primer pair with said DNA to
produce an amplification product; and

(d) detecting said amplification product.

16. The method according to claim 15, wherein said amplification product has a length of 50 to
1,400 bps.

17. A method of quantitating the level of a Nitrospira species in a sample, said method comprising
the steps of:

(a) lysing cells in said sample to release genomic DNA;

(b) contacting denatured genomic DNA from step (a) with a primer pair according to
claim 6;

(c) amplifying Nitrospira DNA by cyclically reacting said primer pair with said DNA to
produce an amplification product; and

(d) detecting said amplification product and quantitating the level of said product by
comparison with at least one reference standard.

18. The method according to claim 17, wherein said amplification product has a length of 50 to 
1,400 bps. 

19. A method of detecting a Nitrospira species in a sample, said method comprising the steps of: 
(a) lysing cells in said sample to release genomic DNA;

(b) contacting denatured genomic DNA from step (a) with a labelled probe according to
claim 4 under conditions which allow hybridisation of said genomic DNA with said probe;

(c) separating hybridised labeled probe and genomic DNA from unhybridised labeled
probe; and

(d) detecting said labeled probe-genomic DNA hybrid.

20. A method of detecting cells of a Nitrospira species in a sample, said method comprising the
steps of: 

(a) treating cells in said sample to fix cellular contents; 

(b) contacting said fixed cells from step (a) with a labeled probe according to claim 10
under conditions which allow said probe to hybridise with RNA within said fixed cell;

(c) removing unhybridised probe from said fixed cells; and 

(d) detecting said labeled probe-RNA hybrid. 
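
Step (d) of claim 17 quantitates the amplification product by comparison with at least one reference standard. The Python sketch below illustrates one way such a comparison could be made; it is not taken from the specification. It assumes that a detection signal (for example band intensity) varies linearly with the log10 of the template amount and that at least two standards are available for fitting. The function names `fit_standard_curve` and `estimate_amount` and every number shown are hypothetical and chosen for illustration only.

```python
import math


def fit_standard_curve(standards):
    """Fit signal = slope * log10(amount) + intercept by least squares.

    `standards` is a list of (known_amount, measured_signal) pairs.
    Assumption: a log-linear response; at least two standards are needed.
    """
    if len(standards) < 2:
        raise ValueError("need at least two reference standards to fit a line")
    xs = [math.log10(amount) for amount, _ in standards]
    ys = [signal for _, signal in standards]
    n = len(standards)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_amount(signal, slope, intercept):
    """Invert the fitted line to estimate the amount behind a sample signal."""
    return 10 ** ((signal - intercept) / slope)


# Hypothetical reference standards: (template copies, arbitrary signal units).
REFERENCE_STANDARDS = [(1e3, 10.0), (1e4, 20.0), (1e5, 30.0), (1e6, 40.0)]

slope, intercept = fit_standard_curve(REFERENCE_STANDARDS)
sample_estimate = estimate_amount(25.0, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, estimate={sample_estimate:.0f}")
```

With a single reference standard, a simple ratio of sample signal to standard signal could be used instead of a fitted line; the curve above is only one possible reading of "at least one reference standard".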
                               TA
SBR2016  GCAAGTCGAG CGAGAAGGTG TA
RC7      GCAAGTCGAG CGAGAAGGTG TA
RC14                CGAGAAGGTG TA
RC99     GCAAGTCGAT CGAGAAGGTG TA
RC11     GCAAGTCGAT CGAGAAGGTG TA
RC73     GCAAGTCGAT CGANAAGGTG TA
RC90     GCAAGTCGAT CGANAAGGTG TA
]

[ 151 200 ]
SBR1024  CGTTTGTAAA GCGGC..... ..GAACGGGT GAGGAATACA TGGGTAACCT
SBR1015  CGTTTGTAAA GCGGC..... ..GAACGGGT GAGGAATACA TGGGTAGCCT
GC86     CGTTTGTAAA GCGGC..... ..GAACGGGT GAGGAATACA TGGGTAACCT
SBR2046  CGTTTGTAAA GCGGC..... ..GAACGGGT GAGGAATACA TGGGTAACCT
RC25     CGTTTGTAAA GCGGC..... ..GAACGGGT GAGGAATACA TGGGTAATCT
RC19     CACTTGTAAA GCGGC..... ..GAACGGGT GAGGAATACA TGGGTAATCT
SBR2016  CACTTGTAAA GCGGC..... ..GAACGGGT GAGGAATACA TGGGTAATCT
RC7      CACTTGTAAA GCGGC..... ..GAACGGGT GAGGAATACA TGGGTAATCT
RC14     CACTTGTAAA GCGGC..... ..GAACGGGT GAGGAATACA TGGGTAATCT
RC99     CACTTGTAAA GCGGC..... ..GAACGGGT GAGGAATACA TGGGTAATCT
RC11     CACTTGTAAA GCGGC..... ..GAACGGGT GAGGAATACA TGGGTAATCT
RC73     CACTTGTAAA GCGGC..... ..GAACGGGT GAGGAATACA TGGGTAATCT
RC90     CACTTGTAAA GCGGC..... ..GAACGGGT GAGGAATACA TGGGTAATCT

[ 201 250 ]
SBR1024  ACCTTCGAGT GGGGAATAAC TAGCCGAAAG GTTAGCTAAT ACCGCATACG
SBR1015  ACCCTCGAGT GGGGAATAAC TAACCGAAAG GTTAGCTAAT ACCGCATACG
GC86     ACCCTCGAGT GGGGAATAAC TAGCCGAAAG GTTAGCTAAT ACCGCATACG
SBR2046  ACCCTCGAGT GGGGAATAAC TAACCGAAAG GTTAGCTAAT ACCGCATACG
RC25     ACCATCGAGT GGGGAATAAC CAACCGAAAG GTTGGCTAAT ACCGCGTACG
RC19     ACCATCGAGT GGGGAATAAC CAGCCGAAAG GTTGGCTAAT ACCGCGTACG
SBR2016  ACCATCGAGT GGGGAATAAC CAACCGAAAG GTTGGCTAAT ACCGCGTACG
RC7      ACCATCGAGT GGGGAATAAC CAACCGAAAG GTTGGCTAAT ACCGCGTACG
RC14     ACCATCGAGT GGGGAATAAC CAACCGAAAG GTTGGCTAAT ACCGCGTACG
RC99     ACCATCGAGT GGGGAATAAC CAACCGAAAG GTTGGCTAAT ACCGCGTACG
RC11     ACCATCGAGT GGGGAATAAC CAACCGAAAG GTTGGCTAAT ACCGCGTACG
RC73     ACCATCGAGT GGGGAATAAC CAACCGAAAG GTTGGCTAAT ACCGCGTACG
RC90     ACCATCGAGT GGGGAATAAC CAACCGAAAG GTTGGCTAAT ACCGCGTACG

[ 251 300 ]
SBR1024  ACTCCTGGTC .TGC..GGAT CGGGAGAGAA AGCGATACC  GTG.
SBR1015  GCTCCTGGTC .TGC..GGAT CGGGAGAGAA AGCGATACC  GTG.
GC86     ACTCCTGGTC .TGC..GGAT CGGGAGAGAA AGCGATACC  GTG.
SBR2046  GCTCCTGGTC .TGC..GGAT CGGGAGAGAA AGCGATACC  GTG.
RC25     CTTCTGAGTC .TTC..GGGT TCGGAAGGAA AGCCGTACT  GTG.
RC19     CTTCCGAGTC .TTC..GGGC TTGGAAGGAA AGCCGCACT  GTG.
SBR2016  CTTCTGAGCC .TTC..GTGT TCGGAAGGAA AGCCGTACT  GTG.
RC7      CCTCCGAGTC .TTC..GGGT TCGGAGGGAA AGCTGCACT  GTG.
RC14     CCTCCGAGTC .TTC..GGGT TCGGAGGGAA AGCTGCACT  GTG.
RC99     CCTCCGAGTC .TTC..GGGT TCGGAGGGAA AGCTGCACT  GTG.
RC11     CCTCCGAGTC .TTC..GGGT TCGGAGGGAA AGCTGCACT  GTG.
RC73     CCTCCGAGTC .TTC..GGGT TCGGAGGGAA AGCTGCACT  GTG.
RC90     CTTCCGAGTC .TTC..GGGC TTGGAAGGAA AGCCGCACT  GTG.

[ 301 350 ]
SBR1024  .....GGTAT CGCGCTCTTG GATGGGCTCA TGTCCTATCA GCTTGTTGGT
SBR1015  .....GGTAT CGCGCTCTTG GATGGGCTCA TGTCCTATCA GCTTGTTGGT
GC86     .....GGTAT CGCGCTCTTG GATGGGCTCA TGTCCTATCA GCTTGTTGGT
SBR2046  .....GGTAT CGCGCTCTTG GATGGGCTCA TGTCCTATCA GCTTGTTGGT
RC25     .....AGTGC GGCGCTCTTT GATGAGCTCA TGTCCTATCA GCTTGTTGGT
RC19     .....AGTGC GGCGCTCTTT GATGAGCTCA TGTCCTATCA GCTTGTTGGT
SBR2016  .....AGTGC GGCGCTCTTT GATGAGCTCA TGTCCTATCA GCTTGTTGGT
RC7      .....AGTGT AGCGCTCTTT GATGAGCTCA TGTCCTATCA GCTTGTTGGT
RC14     .....AGTGT AGCGCTCTTT GATGAGCTCA TGTCCTATCA GCTTGTTGGT
RC99     .....AGTGT AGCGCTCTTT GATGAGCTCA TGTCCTATCA GCTTGTTGGT
RC11     .....AGTGT AGCGCTCTTT GATGAGCTCA TGTCCTATCA GCTTGTTGGT
RC73     .....AGTGT AGCGCTCTTT GATGAGCTCA TGTCCTATCA GCTTGTTGGT
RC90     .....AGTGC GGCGCTCTTT GATGAGCTCA TATCCTATCA NCTTGTTGGT

[ 351 400 ]
SBR1024  GAGGTAACGG CTCACCAAGG CTTCGACGGG TAGCTGGTCT GAGAGGACGA
SBR1015  GAGGTAACGG CTCACCAAGG CTTCGACGGG TAGCTGGTCT GAGAGGACGA
GC86     GAGGTAACGG CTCACCAAGG CTTCGACGGG TAGCTGGTCT GAGAGGACGA
SBR2046  GAGGTAACGG CTCACCAAGG CTTCGACGGG TAGCTGGTCT GAGAGGACGA
RC25     AGGGTAACGG CCTACCAAGG CTTTGACGGG TAGCTGGTCT GAGAGGACGA
RC19     AGGGTAACGG CCTACCAAGG CTTTGACGGG TAGCTGGTCT GAGAGGACGA
SBR2016  AGGGTAACGG CCTACCAAGG CTTTGACGGG TAGCTGGTCT GAGAGGACGA
RC7      AGGGTAACGG CCTACCAAGG CTTTGACGGG TAGCTGGTCT GAGAGGACGA
RC14     AGGGTAACGG CCTACCAAGG CTTTGACGGG TAGCTGGTCT GAGAGGACGA
RC99     AGGGTAACGG CCTACCAAGG CTTTGACGGG TAGCTGGTCT GAGAGGACGA
RC11     AGGGTAACGG CCTACCAAGG CTTTGACGGG TAGCTGGTCT GAGAGGACGA
RC73     AGGGTAACGG CCTACCAAGG CTTTGACGGG TATCTGGTCT GAGAGGACGA
RC90     AGGGTAACGG CCTACCAAGG CTTTGACGGG TATCTGGTCT GAGAGGACGA

[ 401 450 ]
SBR1024  TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
SBR1015  TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
GC86     TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
SBR2046  TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
RC25     TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
RC19     TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
SBR2016  TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
RC7      TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
RC14     TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
RC99     TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
RC11     TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
RC73     TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA
RC90     TCAGCCACAC TGGCACTGCG ACACGGGCCA GACTCCTACG GGAGGCAGCA

[ 451 500 ]
SBR1024  GTAAGGAATA TTGCGCAATG GGC.GACAGC CTGACGCAGC NACGCCGCGT
SBR1015  GTAAGGAATA TTGCGCAATG GGC.GACAGC CTGACGCAGC NACGCCGCGT
GC86     GTAAGGAATA TTGCGCAATG GGC.GACAGC CTGACGCAGC NACGCCGCGT
SBR2046  GTAAGGAATA TTGCGCAATG GGC.GACAGC CTGACGCAGC GACGCCGCGT
RC25     GTAAGGAATA TTGCGCAATG GGC.GAAAGC CTGACGCAGC NACGCCGCGT
RC19     GTAAGGAATA TTGCGCAATG GGC.GAAAGC CTGACGCAGC GACGCCGCGT
SBR2016  GTAAGGAATA TTGCGCAATG GGC.GAAAGC CTGACGCAGC NACGCCGCGT
RC7      GTAAGGAATA TTGCGCAATG GGC.GAAAGC CTGACGCAGC NACGCCGCGT
RC14     GTAAGGAATA TTGCGCAATG GGC.GAAAGC CTGACGCAGC NACGCCGCGT
RC99     GTAAGGAATA TTGCGCAATG GGC.GAAAGC CTGACGCAGC CACGCCGCGT
RC11     GTAAGGAATA TTGCGCAATG GGC.GAAAGC CTGACGCAGC CACGCCGCGT
RC73     GTAAGGAATA TTGCGCAATG GGC.GAAACC CNGACGCAGC CACGCCGCGT
RC90     GTAAGGAATA TTGCGCAATG GGC.GAAACC CNGACGCAGC CACGCCGCGT



[ 501 550 ]
SBR1024  GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGCA GGGAAGATGG
SBR1015  GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGCA GGGAAGATGG
GC86     GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGCA GGGAAGATGG
SBR2046  TGGGGATGAA AGTC.TTCCG ATTGTAAACC CCTTTCCGCA GGGAAGATGG
RC25     GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG
RC19     GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG
SBR2016  GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG
RC7      GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG
RC14     GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG
RC99     GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG
RC11     GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG
RC73     GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG
RC90     GGGGGATGAA GGTC.TTCGG ATTGTAAACC CCTTTCGGGA GGGAAGATGG



[ 551 600 ]
SBR1024  AACGG      GTAA       CCGTTCG    GACGGTACCT GCAGAAGCAG
SBR1015  AACGG      GTAA       CCGTTCG    GACGGTACCT GCAGAAGCAG
GC86     AACGG      GTAA       CCGTTCG    GACGGTACCT GCAGAAGCAG
SBR2046  AACGG      .GTAA      CCGTTCG    GACGGTACCT GCAGAAGCAG
RC25     AGCGA      GCAA       TCGTTCG    GACGGTACCT CCAGAAGCAG
RC19     AGCCA      GCAA       TCGTTCG    GACGGTACCT CCAGAAGCAG
SBR2016  AGCGA      GCAA       TCGTTCG    GACGGTACCT CCAGAAGCAG
RC7      AGCGA      GCAA       TCGTTCG    GACGGTACCT CCAGAAGCAG
RC14     AGCGA      GCAA       TCGTTCG    GACGGTACCT CCAGAAGCAG
RC99     AGCGA      GCAA       TCGTTCG    GACGGTACCT CCAGAAGCAG
RC11     AGCGA      GCAA       TCGTTCG    GACGGTACCT CCAGAAGCAG
RC73     AACGA      GCAA       TCGTTCG    GACGGTACCT CCAGAAGCAG
RC90     AACGA..... .GCAA      TCGTTCG    GACGGTACCT CCAGAAGCAG

[ 601 650 ]
SBR1024  CCACGGCTAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
SBR1015  CCACGGCTAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
GC86     CCACGGCTAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
SBR2046  CCACGGCTAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
RC25     CCACGGCCAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
RC19     CCACGGCCAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
SBR2016  CCACGGCCAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
RC7      CCACGGCCAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
RC14     CCACGGCCAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
RC99     CCACGGCCAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
RC11     CCACGGCCAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
RC73     CCACGGCCAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG
RC90     CCACGGCCAA CTTCGTGCCA GCAGCCGCGG TAATACGAAG GTGGCAAGCG

[ 651 700 ]
SBR1024  TTGTTCGGAT TTACTGGGCG TACAGGGAGC GTAGGCGGTT GGGTAAGCCC
SBR1015  TTGTTCGGAT TTACTGGGCG TACAGGGAGC GTAGGCGGTT GGGTAAGCCC
GC86     TTGTTCGGAT TTACTGGGCG TACAGGGAGC GTAGGCGGTT GGGTAAGCCC
SBR2046  TTGTTCGGAT TTACTGGGCG TACAGGGAGC GTAGGCGGTT GGGTAAGCCC
RC25     TTGTTCGGAT TCACTGGGCG TACAGGGTGT GTAGGCGGTT TGGTAAGCCT
RC19     TTGTTCGGAT TCACTGGGCG TACAGGGTGT GTANGCGGTT TGGTAAGCCT
SBR2016  TTGCTTGGAT TCACTGGGCG TACAGGGTGT GTAGGCGGTT TGGTAAGCCT
RC7      TTGTTCGGAT TCACTGGGCG TACAGGGTGT GTAGGCGGTT TGGTAAGCCT
RC14     TTGTTCGGAT TCACTGGGCG TACAGGGTGT GTAGGCGGTT TGGTAAGCCT
RC99     TTGTTCGGAT TCACTGGGCG TACAGGGTGT GTAGGCGGTT TGGTAAGCCT
RC11     TTGTTCGGAT TCACTGGGCG TACAGGGTGT GTAGGCGGTT TGGTAAGCCT
RC73     TTGTTCGGAT TCACTGGGCG TACAGGGTGT GTAGGCGGTT TGGTAAGCCT
RC90     TTGTTCGGAT TCACTGGGCG TACAGGGTGT GTAGGCGGTT TGGTAAGCCT

[ 701 750 ]
SBR1024  TCCGTGAAAT CTCCGGGCCT AACCCGGAAA GTGCGGAGGG GACTGCTCGG
SBR1015  TCCGTGAAAT CTCCGGGCCT AACCCGGAAA GTGCGGAGGG GACTGCTCGG
GC86     TCCGTGAAAT CTCCGGGCCT AACCCGGAAA GTGCGGAGGG GACTGCTCGG
SBR2046  TCCGTGAAAT CTCCGGGCCT AACCCGGAAA GTGCGGAGGG GACTGCTCGG
RC25     TCTGTTAAAG CTTCGGGCCC AACCCGGAAA GCGCAGACGG TACTGCCAGG
RC19     TCTGTTAAAG CTTCGGGCCC AACCCGGAAA GCGCAGAGGG TACTGCCAGG
SBR2016  TCTGTTAAAG CTTCGGGCCC AACCCGAAAA GCGCAGAGGG TACTGCCAGG
RC7      TCTGTTAAAG CTTCGGGCCC AACCCGGAAA GCGCAGGGGG TACTGCCAGG
RC14     TCTGTTAAAG CTTCGGGCCC AACCCGGAAA GCGCAGAGGG TACTGCCAGG
RC99     TCTGTTAAAG CTTCGGGCCC AACCCGGAAA GCGCAGAGGG TACTGCCAGG
RC11     TCTGTTAAAG CTTCGGGCCC AACCCGGAAA GCGCAGAGGG TACTGCCAGG
RC73     TCTGTTAAAG CTTCGGGCCC AACCCGGAAA GCGCAGAGGG TACTGCCAGG
RC90     TCTGTTAAAG CTTCGGGCCC AACCCGGAAA GCGCAGAGGG TACTGCCAGG

[ 751 800 ]
SBR1024  CTAGAGGATG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
SBR1015  CTAGAGGATG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
GC86     CTAGAGGATG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
SBR2046  CTAGAGGATG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
RC25     CTAGAGGGTG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
RC19     CTAGAGGGTG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
SBR2016  CTAGAGGGTG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
RC7      CTAGAGGGTG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
RC14     CTAGAGGGTG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
RC99     CTAGAGGGTG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
RC11     CTAGAGGGTG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
RC73     CTAGAGGGTG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG
RC90     CTAGAGGGTG GGAGAGGAGC GCGGAATTCC CGGTGTAGCG GTGAAATGCG

[ 801 850 ]
SBR1024  TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATTTC
SBR1015  TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATTTC
GC86     TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATTTC
SBR2046  TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATTTC
RC25     TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATACC
RC19     TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATGCC
SBR2016  TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATACC
RC7      TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATACC
RC14     TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATACC
RC99     TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATACC
RC11     TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATACC
RC73     TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATACC
RC90     TAGAGATCGG GAGGAAGGCC GGTGGCGAAG GCGGCGCTCT GGAACATACC

[ 851 900 ]
SBR1024  TGACGCTGAG GCTCGAAAGC GTGGGGAGCA AACAGGATTA GATACCCTGG
SBR1015  TGACGCTGAG GCTCGAAAGC GTGGGGAGCA AACAGGATTA GATACCCTGG
GC86     TGACGCTGAG GCTCGAAAGC GTGGGGAGCA AACAGGATTA GATACCCTGG
SBR2046  TGACGCTGAG GCTCGAAAGC GTGGGGAGCA AACAGGATTA GATACCCTGG
RC25     TGACGCTGAG ACACGAAAGC GTGGGGAGCA AACAGGATTA GATACCCTGG
RC19     TGACGCTGAG ACACGAAAGC GTGGGGAGCA AACAGGATTA GATACCCTGG
SBR2016  TGACGCTGAG ACACGAAAAC GTGGGGAGCA AACAGGATTA GATACCCTGG
RC7      TGACGCTGAG ACACGAAAGC GTGGGGAGCA AACAGGATTA GATACCCTGG
RC14     TGACGCTGAG ACACGAAAGC GTGGGGAGCA AACAGGATTA GATACCCTGG
RC99     TGACGCTGAG ACACGAAAGC GTGGGGAGCA AACAGGATTA GATACCCTGG
RC11     TGACGCTGAG ACACGAAAGC GTGGGGAGCA AACAGGATTA GATACCCTGG
RC73     TGACGCTGAG ACACGAAAGC GTGGGGNGCA AACAGGATTA GATACCCTGG
RC90     TGACGCTGAG ACACGAAAGC GTGGGGAGCA AACAGGATTA GATACCCTGG

[ 901 950 ]
SBR1024  TAGTCCACGC CTTAAACGAT GGATACTAAG TGTCGGCGG
SBR1015  TAGTCCACGC CTTAAACGAT GGATACTAAG TGTCGGCGG
GC86     TAGTCCACGC CTTAAACGAT GGATACTAAG TGTCGGCGG
SBR2046  TAGTCCACGC CTTAAACGAT GGATACTAAG TGTCGGCGG
RC25     TAGTCCACGC CCTAAACTAT GGATACTAAG TGTCGGCGG
RC19     TAGTCCACGC CCTAAACTAT GGATACTAAG TGTCGGCGG
SBR2016  TAGTCCACGC CCTAAACTAT GGATACTAAG TGTCGGCGG
RC7      TAGTCCACGC CCTAAGCTAT GGATACTAAG TGTCGGCGG
RC14     TAGTCCACGC CCTAAACTAT GGATACTAAG TGTCGGCGG
RC99     TAGTCCACGC CCTAAACTAT GGATACTAAG TGTCGGCGG
RC11     TAGTCCACGC CCTAAACTAT GGATACTAAG TGTCGGCGG
RC73     TAGTCCACGC CCTAAACTAT GGATACTAAG TGTCGGCGG
RC90     TAGTCCACGC CCTAAACTAT GGATACTAAG TGTCGGCGG

[ 951 1000 ]
SBR1024  G          TTA        .CCGCCGGTG CCGCAGCTAA
SBR1015  G          TTA        .CCGCCGGTG CCGCAGCTAA
GC86     G          TTA        .CCGCCGGTG CCGCAGCTAA
SBR2046  G          TTA        .CCGCCGGTG CCGCAGCTAA
RC25     G          TTA        .CCGCCGGTG CCGCAGCTAA
RC19     G          TTA        .CCGCCGGTG CCGCAGCTAA
SBR2016  G          TTA        .CCGCCGGTG CCGCAGCTAA
RC7      G          TTA        .CCGCCGGTG CCGCAGCCAA
RC14     G          TTA        .CCGCCGGTG CCGCAGCTAA
RC99     G          TTA        .CCGCCGGTG CCGCAGCTAA
RC11     G          TTA        .CCGCCGGTG CCGCAGCTAA
RC73     G          TTA        .CCGCCGGTG CCGCAGCTAA
RC90     G          TTA        .CCGCCGGTG CCGCAGCTAA

[ 1001 1050 ]
SBR1024  CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
SBR1015  CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
GC86     CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
SBR2046  CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
RC25     CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
RC19     CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
SBR2016  CGCATTAAGT ATCCCGCCTG GGAGGTACGG CCGCAAGGTT GAAACTCAAA
RC7      CGCGTTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
RC14     CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
RC99     CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
RC11     CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
RC73     CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA
RC90     CGCATTAAGT ATCCCGCCTG GGAAGTACGG CCGCAAGGTT GAAACTCAAA

[ 1051 1100 ]
SBR1024  GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC
SBR1015  GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC
GC86     GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC
SBR2046  GGAATTGACG GGGCCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC
RC25     GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC
RC19     GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC
SBR2016  GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCTTGTGGT TTAATTCGAC
RC7      GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC
RC14     GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC
RC99     GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC
RC11     GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC
RC73     GGGATTGACG GGGGCCCGCA CAAGCGGTGG GGCATGTGGT TTAATTCGAC
RC90     GGAATTGACG GGGGCCCGCA CAAGCGGTGG AGCATGTGGT TTAATTCGAC

[ 1101 1150 ]
SBR1024  GCAACGCGAA GAACCTTA.C CCAGGCTGGA CATG       CAGGTAG
SBR1015  GCAACGCGAA GAACCTTA.C CCAGGCTGGA CATG       CAGGTAG.
GC86     GCAACGCGAA GAACCTTA.C CCAGGCTGGA CATG       CAGGTAG
SBR2046  GCAACGCGAA GAACCTTA.C CCAGGCAGGA CATG       CAGGTAG
RC25     GCAACGCGAA GAACCTTA.C CCAGGTTGGA CATG       CACGTAG
RC19     GCAACGCGAA GAACCTTA.C CCAGGTTGGA CATG       CACGTAG
SBR2016  GCAACGCGAA GAACCTTA.C CCAGGTTGGA CATG       CACGTAG
RC7      GCAACGCGAA GAACCTTA.C CCAGGTTGGA CATG       CACGTAG
RC14     GCAACGCGAA GAACCTTA.C CCAGGTTGGA CATG       CACGTAG
RC99     GCAACGCGAA GAACCTTA.C CCAGGTTGGA CATG       CACGTAG
RC11     GCAACGCGAA GAACCTTA.C CCAGGTTGGA CATG       CACGTAG
RC73     GCAACGCGAA GAACCTTA.C CCAGGTTGGA CATG       CACGTAG
RC90     GCAACGCGAA GAACCTTA.C CCAGGTTGGA CATG       CACGTAG

[ 1151 1200 ]
SBR1024  TAGAAGGGT. .GAAA..GCC TAACGAGGTA GCAA       TACCAT
SBR1015  TAGAAGGGT. .GAAA..GCC TAACGAGGTA GCAA       TACCAT
GC86     TAGAAGGGT. .GAAA..GCC TAACGAGGTA GCAA       CACCAT
SBR2046  TAGAAGGGT. .GAAA..GCC TAACGAGGTA GCAA       TACCAT
RC25     TAGAAAGGT. .GAAA..GCC TGACGAGGTA GCAA       TACCAG
RC19     TAGAAAGGT. .GAAA..GNC TAACGAGGTA GCAA       TACCAG
SBR2016  TAGAAAGGT. .GAAA..GCC TGACGAGGTA GCAA       TACCAG
RC7      TAGAAAGGT. .GAAA..GCC TGACGAGGTA GCAA       TACCAG
RC14     TAGAAAGGT. .GAAA..GCC TGACGAGGTA GCAA       TACCAG
RC99     TAGAAAGGT. .GAAA..GCC TGACGAGGTA GCAA       TACCAG
RC11     TANAAAGGTA .GAAA..GCC TGACGAGGTA GCAA       TACCAG
RC73     TNGAAAGGT. .GAAA..GCC TGACGAGGTA GCAA       TACCAG
RC90     TAGAAAGGT. .GAAA..GCC TGACGAGGTA GCAA       TACCAG

[ 1201 1250 ]
SBR1024  CCTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTGTTGG
SBR1015  CCTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCCTGCCG[illegible] [illegible]
GC86     CCTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTCTTGG
SBR2046  CCTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTGTTGG
RC25     CGTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTGTTGG
RC19     CGTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTGTTGG
SBR2016  CGTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTGTTGG
RC7      CGTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTGTTGG
RC14     CGTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTGTTGG
RC99     CGTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTGTTGG
RC11     CGTGCTCAGG TGCTGCATGG CTGTCTTCAG CTCGTGCCGT GAGGTGTTGG
RC73     CGTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTGTTGG
RC90     CGTGCTCAGG TGCTGCATGG CTGTCGTCAG CTCGTGCCGT GAGGTGTTGG

[ 1251 1300 ]
SBR1024  GTTAAGTCCC GCAACGAGCG CAACCCCTGT CTTCAGTTAC CAACGG....
SBR1015  GTTAAGTCCC GCAACGAGCG CAACCCCTGT CTTCAGTTAC CAACGG....
GC86     GTTAAGTCCC GCAACGAGCG CAACCCCTGT CTTCAGTTAC CAACGG....
SBR2046  GTTAAGTCCC GCAACGAGCG CAACCCCTGT CTTCAGTTAC CAACGG....
RC25     GTTAAGTCCC GCAACGAGCG CAACCCCTGC TTTCAGTTGC TACCGG....
RC19     GTTAAGTCCC GCAACGAGCG CAACCCCTGC TTTCAGTTGC TACCGG....
SBR2016  GTTAAGTCCC GCAACGAGCG CAACCCCTGC TTTCAGTTGC TACCGG....
RC7      GTTAAGTCCC GCAACGAGCG CAACCCCTGC TTTCAGTTGC TACCGG....
RC14     GTTAAGTCCC GCAACGAGCG CAACCCCTGC TTTCAGTTGC TACCGG....
RC99     GTTAAGTCCC GCAACGAGCG CAACCCCTGC TTTCAGTTGC TACCGG....
RC11     GTTAAGTCCC GCAACGAGCG CAACCCCTGC TTTCAGTTGC TACCGG....
RC73     GTTAAGTCCC GCAACGAGCG CAACCCCTGC TTTCAGTTGC TACCGG....
RC90     GTTAAGTCCC GCAACGAGCG CAACCCCTGC TTTCAGTTGC TGCCGG....

[ 1301 1350 ]
SBR1024  GTCATG.... CCGGGAACTC TGGAGAGACT GCCCAGGAGA ACGGG.GAGG
SBR1015  GTCATG.... CCGGGAACTC TGGAGAGACT GCCCAGGAGA ACGGGGGAGG
[illegible] [illegible] CCGGGAACTC TGGAGAGACT GCCCAGGAGA ACGGG.GAGG
SBR2046  GTCATG.... CCGGGAACTC TGGAGAGACT GCCCAGGAGA ACGGG.GAGG
[illegible] [illegible] CCGAGCACTC TGAAAGGACT GCCCAGGATA ACGGG.GAGG
RC19     GTCATG.... CCGAGCACTC TGAAAGGACT GCCCAGGATA ACGGG.GAGG
SBR2016  [illegible] [illegible] TGAAAGGACT GCCCAGGATA [illegible]
RC14     [illegible] CCGAGCACTC TGAAAGGACT GCCCAGGATA ACGGGGGAGG
[illegible] [illegible] CCGAGCACTC TGAAAGGACT GCCCAGGATA ACGGG.GAGG
RC11     [illegible] CCGAGCACTC TGAAAGGACT GCCCAGGATA ACGGGGAAGG
[illegible] [illegible] CCGAACACTC TGAAAGGACT GCCCAGGATA ACGGGGAAGG
[illegible] [illegible] CGAACACTC TGAAAGGACT GCCCAGGATA ACGGGGAAGG
RC90     GTCATG.... CCGAACACTC TGAAAGGACT GCCCAGGATA ACGGGGAAGG

[ 1351 1400 ]
SBR1024  AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC
SBR1015  AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC
GC86     AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC
SBR2046  AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC
RC25     AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC
RC19     AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC
SBR2016  AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC
RC7      AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC
RC14     AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC
RC99     AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC
RC11     AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC
RC73     AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATACCT GGGGCCACAC
RC90     AAGGTGGGGA TGACGTCAAG TCAGCATGGC CTTTATGCCT GGGGCCACAC



[ 1401 1450 ]
SBR1024  ACGTGCTACA ATGGCCGGTA CAAAGCGCTG CAAACCC.GT AAGGGGGAGC
SBR1015  ACGTGCTACA ATGGCCGGTA CAAAGCGCTG CAAACCC.GT AAGGGGGAGC
GC86     ACGTGCTACA ATGGCCGGTA CAAAGCGCTG CAAACCC.GT AAGGGGGAGC
SBR2046  ACGTGCTACA ATGGCCGGTA CAAAGCGCTG CAAACCC.GT AAGGGGGAGC
RC25     ACGTGCTACA ATGGCCGGTA CAAAGCGCTG CAAACCC.GT GAGGGGGAGC
RC19     ACGTGCTACA ATGGCCGGTA CAAAGCGCTG CAAACCC.GT GAGGGGGAGC
SBR2016  ACGTGCTACA ATGGCCGGTA CAAAGCGCTG CAAACCC.GT GAGGGGGAGC
RC7      ACGTGCTACA ATGGCCGGTA CAAAACGCTG CAAACCC.GT GAGGGGGAGC
RC14     ACGTGCTACA ATGGCCGGTA TAAAACGCTG CAAACCC.GT GAGGGGGAGC
RC99     ACGTGCTACA ATGGCCGGTA CAAAACGCTG CAAACCC.GT GAGGGGGAGC
RC11     ACGTGCTACA ATGGCCGGTA CAAAGCGCTG CAAACCC.GT GAGGGGGAGC
RC73     ACGTGCTACA ATGGCCGGTA CAAAACGCTG CAAACCC.GT GAGGGGGAGC
RC90     ACGTGCTACA ATGGCCGGTA CAAAACGCTG CAAACCC.GT GAGGGGGAGC

[ 1451 1500 ]
SBR1024  CAATCCCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
SBR1015  CAATCGCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
GC86     CAATCGCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
SBR2046  CAATCGCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
RC25     CAATCGCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
RC19     CAATCGCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
SBR2016  CAATCGCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
RC7      CAATCGCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
RC14     CAATCGCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
RC99     CAATCGCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
RC11     CAATCGCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
RC73     CAATCGCAAA AAACCGGCCT CAGTTCAGAT TGAGGTCTGC AACTCGACCT
RC90     CAATCGCAAA AAACCGGCCT CAGTTCANAT TGAGGTCTGC AACTCGACCT

[ 1501 1550 ]
SBR1024  CATGAAGGCG GAATCGCTAG TAATCCCGGA TCAG.CACGC CGGGGTGAAT
SBR1015  CATGAAGGCG GAATCGCTAG TAATCCCGGA TCAG.CACGC CGGGGTGAAT
GC86     CATGAAGGCG GAATCGCTAG TAATCCCGGA TCAG.CACGC CGGGGTGAAT
SBR2046  CATGAAGGCG GAATCGCTAG TAATCCCGGA TCAG.CACGC CGGGGTGAAT
RC25     CATGAAGGCG GAATCGCTAG TAATCGCGGA TCAG.CACGC CGCGGTGAAT
RC19     CATGAAGGCG GAATCGCTAG TAATCGCGGA TCAG.CACGC CGCGGTGAAT
SBR2016  CATGAAGGCG GAATCGCTAG TAATCGCGGA TCAG.CACGC CGCGGTGAAT
RC7      CATGAAGGCG GAATCGCTAG TAATCGCGGA TCAG.CACGC CGCGGTGAAT
RC14     CATGAAGGCG GAATCGCTAG TAATCGCGGA TCAG.CACGC CGCGGTGAAT
RC99     CATGAAGGCG GAATCGCTAG TAATCGCGGA TCAG.CACGC CGCGGTGAAT
RC11     CATGAAGGCG GAATCGCTAG TAATCGCGGA TCAG.CACGC CGCGGTGAAT
RC73     CATGAATGCG GAATCGCTAG TAATCGCGGA TCAG.CACGC CGCGGTGAAT
RC90     CATGAATGCG GAATCGCTAG TAATCGCGGA TCAG.CACGC CGCGGTGAAT

[ 1551 1600 ]
SBR1024  ACGTTCCCGG GCCTTGTACA CACCGCCCGT CACACCACGA AAGTTTGTTG
SBR1015  ACGTTCCCGG ACCTTGTACA CACCGCCCGT CACACCACGA AAGTTTGTTG
GC86     ACGTTCCCGG GCCTTGTACA CACCGCCCGT CACACCACGA AAGTTTGTTG
SBR2046  ACGTTCCCGG GCCTTGTACA CACCGCCCGT CACACCACGA AAGTTTGTTG
RC25     ACGTTCCCGG GCCTTGTACA CACCGCCCGT CACACCACGA AAGCCTGTTG
RC19     ACGTTCCCGG GCCTTGTACA CACCGCCCGT CACACCACGA AAGCCTGTTG
SBR2016  ACGTTCCCGG GCCTTGTACA CACCGCCCGT CACACCACGA AAGCCTGTTG
RC7      ACGTTCCCGG GCCTTGTGCA CACCGCCCGT CACACCACGA AAGCCTGTTG
RC14     ACGTTCCCGG GCCTTGTACA CACCGCCCGT CACACCACGA AAGCCTGTTG
RC99     ACGTNCCCGG GCCTTGTACA CACCGCCCGT CACACCACGA AAGCCTGTTG
RC11     ACGTNCCCGG GCCTTGTACA CACCGCCCGT CACACCACGA AAGCCTGTTG
RC73     ACGTNCCCGG GCCTTGTACA CACCGCCCGT CACACCACGA AAGCCTGTTG
RC90     ACGTNCCCGG GCCTTGTACA CGCCGCCCGT CACACCACGA AAGCCTGTTG

[ 1601 1650 ]
SBR1024  TACCTGAAGT CGTTGGCGCC AACC       GCAA       GGAGGCAGAC
SBR1015  TACCTGAAGT CGTTGGCGCC AACC       GCAA       GGAG
GC86     TACCTGAAGT CGTTGGCGCC AACC       GCAA       GGGGGCAGAC
SBR2046  TACCTGAAGT CGTTGGCGCC AACC       GCAA       GGAGGCAGAC
RC25     TACCTGAAGT CGCCCAAGCC AACC       GCAA       GGAGGCAGGC
RC19     TACCTGAAGT CGCCCAAGCC AACC       GCAA       GGAGGCAGGC
SBR2016  TACCTGAAGT CGCCCAAGCC AACC       GCAA       GGAGGCAGGC
RC7      TACCTGAAGT CGCCCAAGCC AACC       GCAA       GGAGGCAGGC
RC14     TACCTGAAGT CGCCCAAGCC AACC       GCAA       GGAGGCAGGC
RC99     TACCTGAAGT CGCCCAAGCC AACC       GCAA       GAAGGCAGGC
RC11     TACCTGAAGT CGCCCAAGCC AACC       GCAA       GGAGGCAGGC
RC73     TACCTGAAGT CGCCCAAGCC AACC       GCAA       GGAGGCAGGC
RC90     TACCTGAAGT CGCCCAAGCC AACC       GCAA       GGAGGCANGC

Fig. 8 (continued)

[ 1651 1700 ]
SBR1024  GCCCACGGTA TGACCGATGA TTGGG
SBR1015
GC86     GCCCACGGTA TGACCGATGA TTGGGGTGAA GTCGTAACAA GGTAACCGTA
SBR2046  GCCCACGGTA TGACCGATGA TTGGGG
RC25     GCCCACGGTA TGGCCCGTGA TTGGGGTGAA GTCGTAACAA GGTAACCGTA
RC19     GCCCACGGTA TGGCCGGTGA TTGGGGTGAA GTCCTAACA-
SBR2016  GCCCACGGTA TGGC
RC7      GCCCACGGTA TGGCCG
RC14     GCCCACGGTA TGGCCGGTGA T
RC99     GCCCACGGTA TGGCCGGTGA
RC11     GCCCACGGTA TGGCCGGTGA TGGGG
RC73     GCCCACGGTA TGGCCGGTGA TGGGG
RC90     GCCCACGGTA TGGCCGGTGA TG...

[ 1701 1750 ]
SBR1024
SBR1015
GC86     ATC-
SBR2046
RC25     AA--
RC19
SBR2016
RC7
RC14
RC99
RC11
RC73
RC90
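
Claim 10(ii) refers to sequences having at least 92% identity with any one of SEQ ID NO: 1 to SEQ ID NO: 13, and the alignment above shows where the clone sequences differ from one another. The Python sketch below illustrates one simple way a percent identity could be computed from two aligned rows; it is not the method of the specification, which does not state here how identity is calculated. The function name `percent_identity`, the treatment of gap characters (`.` and `-`) and of the ambiguity code `N`, and the use of a single 50-column window are all assumptions. The two sample rows are the SBR1024 and RC25 rows at alignment positions 351 to 400.

```python
def percent_identity(row_a, row_b, gap_chars=".-"):
    """Percent identity between two rows of the same alignment block.

    Assumptions (not from the specification): spaces are layout only,
    columns where both rows carry a gap are skipped, and a column
    containing N never counts as a match.
    """
    a = row_a.replace(" ", "").upper()
    b = row_b.replace(" ", "").upper()
    if len(a) != len(b):
        raise ValueError("aligned rows must have the same number of columns")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x in gap_chars and y in gap_chars:
            continue  # shared gap column carries no information
        compared += 1
        if x == y and x != "N":
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Rows copied from alignment positions 351 to 400.
SBR1024_351_400 = "GAGGTAACGG CTCACCAAGG CTTCGACGGG TAGCTGGTCT GAGAGGACGA"
RC25_351_400 = "AGGGTAACGG CCTACCAAGG CTTTGACGGG TAGCTGGTCT GAGAGGACGA"

IDENTITY_THRESHOLD = 92.0  # threshold recited in claim 10(ii)

identity = percent_identity(SBR1024_351_400, RC25_351_400)
meets_threshold = identity >= IDENTITY_THRESHOLD
print(f"identity = {identity:.1f}%, meets 92% threshold: {meets_threshold}")
```

For these two rows, 45 of the 50 columns match, giving 90.0% over this window. A window this short says little about identity over a full-length sequence or over a probe-length stretch, so the figure is illustrative only.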
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